JavaScript is a popular programming language whose scope has grown considerably over the last decade. But does it have the tools and features to do data science?
JavaScript is the backbone of the Internet. Created in the 90s as a way to manipulate the content of web pages in the browser, it has grown in both popularity and functionality, becoming one of the most popular programming languages in the world.
With its growing popularity, many wonder if it would be a good option for data science and data analysis in general. The answer is not so simple. Let's take a look at the pros and cons of JavaScript as a data science language.
The argument against JavaScript
No data scientist in their right mind would recommend that newcomers learn JavaScript first. Python, R, Scala, and Julia are often cited as THE programming languages for data science. But why is that?
The authors of JavaScript for Data Science (Maya Gans, Toby Hodges, and Greg Wilson) tell us in the first few pages of the book that David Beazley thought JavaScript versus Data Science would be a better title for it, a nod to JavaScript's reputation.
JavaScript is famous for its many quirks, some of which have to do with the way it handles numbers. As a common example, IEEE 754, the floating-point standard, introduced NaN (not a number) to represent a value that cannot be represented within the limitations of the numeric type.
In simple terms, divide zero by zero and JavaScript will return NaN (dividing any other number by zero returns Infinity). Not the most informative or helpful answer. Even worse, the language still reports the type of NaN as a number, so if you just check the type you won't catch that pesky rogue.
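The NaN behavior described above can be seen in a few lines. This is a minimal sketch using only built-in JavaScript features; the variable names and the helper `isUsableNumber` are hypothetical and exist only for this illustration. Note that only zero divided by zero yields NaN here, while a non-zero number divided by zero yields Infinity.

```javascript
// 0 / 0 is mathematically undefined, so IEEE 754 arithmetic yields NaN.
const undefinedRatio = 0 / 0;

// Dividing a non-zero number by zero yields Infinity rather than NaN.
const infiniteRatio = 1 / 0;

// typeof reports "number" for NaN, so a type check alone does not catch it.
const nanType = typeof undefinedRatio;

// NaN is not equal to anything, including itself.
const nanEqualsItself = undefinedRatio === undefinedRatio;

// Hypothetical helper: accept only real, finite numbers.
// Number.isFinite rejects NaN, Infinity, and -Infinity without coercing its input.
function isUsableNumber(value) {
  return typeof value === "number" && Number.isFinite(value);
}

console.log(undefinedRatio, infiniteRatio, nanType, nanEqualsItself);
console.log(isUsableNumber(undefinedRatio), isUsableNumber(42));
```

A check like `isUsableNumber` is one way to filter out bad values before they propagate through a calculation, since any arithmetic involving NaN produces NaN again.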
Honestly, this is a minor inconvenience, but it points to a much broader problem: JavaScript is dynamically typed and has a pretty flexible way of figuring out what a number or a string is. Again, it is nothing that can't be avoided, but it definitely requires some defensive coding.
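To make the "flexible" typing concrete, here is a small sketch of implicit coercion followed by one possible defensive pattern. The helper name `toNumberStrict` is an assumption made up for this example, not part of any library, and real projects may prefer a validation library or TypeScript instead.

```javascript
// Implicit coercion: the same operands behave differently depending on the operator.
const concatenated = "5" + 2; // "+" with a string concatenates
const multiplied = "5" * 2;   // "*" converts the string to a number first

// Hypothetical helper: convert raw input to a finite number or fail loudly.
function toNumberStrict(raw) {
  // Number("") is 0, which would silently hide missing values, so reject it.
  if (typeof raw === "string" && raw.trim() === "") {
    throw new TypeError("Empty string is not a number");
  }
  const value = typeof raw === "string" ? Number(raw) : raw;
  if (typeof value !== "number" || !Number.isFinite(value)) {
    throw new TypeError(`Not a finite number: ${String(raw)}`);
  }
  return value;
}

// Mixed input, as it might arrive from a CSV file or a web form.
const cleanedValues = ["3.5", 4, " 10 "].map(toNumberStrict);

console.log(concatenated, multiplied, cleanedValues);
```

Converting and validating at the boundary, where data enters the program, keeps the rest of the analysis code free of repeated type checks.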
Dealing with large numbers is also a problem. JavaScript's standard number type is a 64-bit float, so integers beyond 2^53 - 1 lose precision, and because the language runs on a single thread by default, with parallelism bolted on through workers rather than built into the core model, big data workloads are an awkward fit. Neither JavaScript in the browser nor Node.js is well suited to computationally intensive, CPU-bound tasks.
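The imprecision with large numbers can be demonstrated directly. The sketch below uses the built-in `Number.MAX_SAFE_INTEGER` constant and, as one possible workaround that the discussion above does not cover, the built-in `BigInt` type; the variable names are hypothetical.

```javascript
// The largest integer a standard JavaScript number can represent exactly: 2^53 - 1.
const maxSafe = Number.MAX_SAFE_INTEGER;

// Beyond that limit, distinct integers collapse into the same floating-point value.
const distinctSumsCollide = maxSafe + 1 === maxSafe + 2;

// Number.isSafeInteger reports whether an integer is still exactly representable.
const stillSafe = Number.isSafeInteger(maxSafe + 1);

// BigInt (written with an "n" suffix) keeps arbitrary-size integers exact,
// at the cost of not mixing with ordinary numbers in arithmetic.
const exactSum = BigInt(maxSafe) + 2n;

// Decimal fractions are also approximations in binary floating point.
const decimalMatches = 0.1 + 0.2 === 0.3;

console.log(maxSafe, distinctSumsCollide, stillSafe, exactSum, decimalMatches);
```

The floating-point behavior is not unique to JavaScript, since any language using IEEE 754 doubles shares it. The difference is that JavaScript historically offered no separate integer type to fall back on.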
Most of these issues can be overcome, but the final nail in the coffin is the opportunity cost. As a data scientist, why would you spend so much time learning JavaScript when you already have a plethora of languages that do it better and with less effort?
Every hour invested in JavaScript is one less hour invested in other languages… but that might not be a bad thing.
The case for JavaScript
The main argument of JavaScript for Data Science is that modern JavaScript has addressed many of these issues and that the JavaScript community's interest in data science has grown rapidly in recent years. This has promoted tools and features that make it a competitive choice.
Perhaps the first point in favor of JavaScript is its ease of use and readability. Case in point, if you're reading this on your computer, just press F12 and you'll have a JavaScript console ready to use right away.
JavaScript is very easy to learn, and because of its popularity, there are literally thousands of resources to help you learn its ins and outs. A quick look at Stack Overflow's statistics reveals that the amount of information about JavaScript is simply staggering.
Another point in its favor is that more and more companies are using web technologies with a Node-based stack to build their products. If a data scientist is going to work closely with product developers, speaking a common language is definitely an advantage.
Even better, the fact that everyone works with the same technology means that integration with other products and services is easier, requiring less overhead and preparation, just as it is easier to communicate when everyone speaks the same language.
TypeScript, a superset of JavaScript developed by Microsoft, addresses one of the main criticisms of JavaScript: its loose, dynamic typing. In fact, with TypeScript's static type checking, the language is stricter than the data scientist's darling, Python. Statically typed languages tend to promote better practices and less buggy code, which is why there has been an explosion in TypeScript development companies and services.
And speaking of Microsoft, Napa.js is a fantastic alternative to Node.js if multithreading is your concern. Although it is still in its early stages and not an ideal solution, it shows how much interest there is in promoting JavaScript as an all-purpose programming language, including for data science. But that is not all…
New tools for data science
One of the most common arguments against JS is that it lacks the data science libraries offered by more robust solutions like R and Python. We absolutely agree with this argument. Even the most ardent JavaScript supporter will have to admit that any aspiring data scientist needs another tool in their repertoire.
That said, the data science landscape for JavaScript is growing quickly. Five years ago, no one would have imagined that TensorFlow would have a working JavaScript library, and yet here we are.
Consider, for example, D3.js, a popular library for data visualization that provides a fantastic set of tools for building dashboards, reports, and data stories in the browser.
Another good example is TensorFlow.js. For those who don't know, TensorFlow is one of the most popular machine learning libraries out there. With its JS variant, you can run machine learning algorithms directly in your browser, on a Node.js server, or both.
But why would you want to do that? Yes, a browser environment is not the most optimized workspace. But on the other hand, it is very convenient for quick prototypes, small projects and applications that do not require a lot of memory. Why create a virtual environment when a simple browser works perfectly?
The fact that we are acquiring these tools for the language that powers the internet and web applications in general is opening doors to new possibilities. With browser-based data science, we can explore new ways of processing and presenting data in a user-friendly environment.
Imagine having a web app that acts as a presentation for your data story, all coded as a frontend solution with JavaScript, HTML, and CSS. Anyone with a smart device and an internet connection can access the results in seconds.
This wave of JavaScript in data science points to the fact that the field is expanding. Data scientists are no longer expected to be people sitting in a dusty corner analyzing data. They are storytellers who have to find ways to present their results and promote data-driven environments.
Another tool in the toolbox
We may not be ready for a world where knowing JavaScript is enough, but as an auxiliary skill set, it's a perfect resource for data scientists. JavaScript probably won't revolutionize the field, but it will increase its scope and reach, which is a positive result in the end.
Source: BairesDev