{"id":372,"date":"2020-06-02T15:43:53","date_gmt":"2020-06-02T19:43:53","guid":{"rendered":"http:\/\/www.7bsoftware.com\/clients\/?p=372"},"modified":"2020-09-09T10:42:58","modified_gmt":"2020-09-09T14:42:58","slug":"is-data-science","status":"publish","type":"post","link":"https:\/\/www.7bsoftware.com\/clients\/2020\/06\/02\/is-data-science\/","title":{"rendered":"Is Data Science?"},"content":{"rendered":"<p>&nbsp;<\/p>\n<p>Let\u2019s consider the idea that the whole Data Science thing might be a bit overhyped.\u00a0 After all, any time there\u2019s this much attention given to any subject, you have to wonder how much of it is just flimflammery.<\/p>\n<p>With this mindset, we can attack the main question, which is\u2026is data science?\u00a0 In other words, are those of us engaged in this practice actual scientists?<\/p>\n<h3>What\u2019s the Experiment?<\/h3>\n<p>We certainly can\u2019t call ourselves scientists unless we\u2019re experimenting on something.\u00a0 However, as we consider the experimentation part, we see a glimmer of hope that maybe there\u2019s some science going on.\u00a0 After all, any programmer who has managed to get somebody to pay for her work knows that the actual job has a lot more to do with finding and fixing bugs than with writing code.<\/p>\n<p>And here we see that there is some experimentation going on.\u00a0 A bug is really best understood as an unplanned event.\u00a0 And in particular, in my experience, it is rare for a bug to be unsurprising.<\/p>\n<p>So what we have here is the need to classify a surprising event, form a concept of what is wrong with our assumptions that would lead to the event, gather evidence to support that concept (or disprove it), and finally reproduce our work (in the form of a program change).<\/p>\n<p>Now, I\u2019m not claiming that bugfixing is actually science.\u00a0 But it is interesting to see how it contains some basic scientific structure.<\/p>\n<h3>Congratulations, Data Scientist.\u00a0 
Here\u2019s your shovel<\/h3>\n<p>From what I\u2019ve experienced, the \u201creal\u201d Data Scientist work has a lot more to do with doing data calls, figuring out the semantics of the data (now, just what\u00a0<em>does\u00a0<\/em>that column mean exactly?), merging data that originates from different sources (and usually different data modelers), dealing with unexpected nulls and duplications, and generally creating order out of chaos.<\/p>\n<p>This has to happen before you can even begin to do the fun stuff, like generating labels or running neural nets on the result.<\/p>\n<p>Apologies to those of you who thought this work was glamorous.<\/p>\n<h3>Are We Sciencing Yet?<\/h3>\n<p>So what exactly do those messy sets of data have in them?<\/p>\n<p>Real-world events, that\u2019s what.\u00a0 Or at least references to them.<\/p>\n<p>And what are we doing when we merge \u2019em and clean \u2019em?<\/p>\n<p>We\u2019re creating a new way of looking at the events that makes sense to us, so we can answer some previously unanswered question about them.<\/p>\n<p>So\u2026is this really different from theorizing and experimenting in the real world?<\/p>\n<p>No, it isn\u2019t.<\/p>\n<p>In fact, the similarities to \u201creal\u201d science go beyond that.\u00a0 In \u201creal\u201d science, nothing is ever proven; all \u201cscientific facts\u201d are contingent, and we have to be humble enough to question the facts when any counterexample presents itself.<\/p>\n<p>For data science (and for the downstream AI we might run on the data sets we create), the same thing is true.\u00a0 We\u2019ve got some notion that there might be a useful, previously unknown order in the data.\u00a0 And we might run an AI experiment on the data, and it might give us a positive result.\u00a0 And the positive result might still be wrong if we made an incorrect assumption somewhere.<\/p>\n<p>Or maybe there was never anything useful there to begin with, and we were completely fooling ourselves all 
along.<\/p>\n<p>Yep, sounds like science to me.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&nbsp; Let\u2019s consider the idea that the whole Data Science thing might be a bit overhyped.\u00a0 After all, any time there\u2019s this much attention given to any subject, you have to wonder how much of it is just flimflammery. With this mindset, we can attack the main question, which is\u2026is data science?\u00a0 In other words, &bull;  <a class=\"read-more\" href=\"https:\/\/www.7bsoftware.com\/clients\/2020\/06\/02\/is-data-science\/\"> Read More &raquo;<\/a><\/p>\n","protected":false},"author":1,"featured_media":378,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-372","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/posts\/372","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/comments?post=372"}],"version-history":[{"count":10,"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/posts\/372\/revisions"}],"predecessor-version":[{"id":384,"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/posts\/372\/revisions\/384"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/media\/378"}],"wp:attachment":[{"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/media?parent=372"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.7bsoftware.com\/clien
ts\/wp-json\/wp\/v2\/categories?post=372"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.7bsoftware.com\/clients\/wp-json\/wp\/v2\/tags?post=372"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}