Why and what: R Programming Language
If you’ve spent any amount of time on the internet, you’ve seen countless debates on the best programming languages for beginners. Depending on the task at hand, you may have good reasons to start with a language like Ruby, Julia, or Go, or with a more domain-specific language like Swift or Kotlin. In fact, as you progress as a developer, you will inevitably come across these technologies and expand your repertoire from the ever-increasing list of modern tools and innovations.
But what’s the deal with R?
R is a language designed for data manipulation, data analysis, and visualization. Today, along with Python, it is one of the two main languages used in machine learning and across the full spectrum of data science work. When you learn R, you’re learning a coherent, integrated collection of tools carefully designed to make you productive across the full range of data analysis tasks.
At Airbnb, R has been amongst the most popular tools for doing data science in many different contexts, including generating product insights, interpreting experiments, and building predictive models. Airbnb supports R usage by creating internal R tools and by creating a community of R users.
How R Helps Airbnb Make the Most of Its Data, 2017
Airbnb’s data science team relies on R every day to make sense of our data. While many of our teammates use Python, R is the most commonly used tool for data analysis at Airbnb.
In fact, by learning R programming, you put yourself in good company. R is a staple in all sorts of data science work, the field that a Harvard Business Review article described, in bold, capitalized font, as “The Sexiest Job of the 21st Century”. When the AI engineers at Facebook’s Core Data Science team released an open source forecasting tool known as “Prophet”, the tool was made available in R and Python. Not in MATLAB, Julia, C, Octave or Java. In R and in Python.
When Twitter’s Engineering team released its “practical and robust anomaly detection” program, they made it available as an open source R package first (a Python package was released later). The same pattern holds for their other open source contributions, such as BreakoutDetection, which was also released as an R package.
When the bright folks at Google developed a technique to “infer causal impact using Bayesian structural time-series models”, they implemented it and open-sourced it as an R package.
I could go on and give you more examples, from nimble startups to large corporations like Microsoft. Microsoft is such a big supporter of R that it bought a company developing tools for R and rebranded it so that it is now part of Microsoft Machine Learning Server. Microsoft tools such as SQL Server, Power BI, and Visual Studio all include first-class support for R, allowing you to write and execute R code.
R is a complete environment for data scientists.
Microsoft
Advantages of R
Not that you need any more convincing, but here’s a quick summary:
- Built by statisticians as a fully integrated scientific environment
- More than 15,000 user-contributed packages on CRAN, including packages from the companies mentioned above
- Open source
- Used and loved by the largest software companies in the world
- Integrates well with big data tools such as Hadoop (we’ll get into this in a future course)
- Employability and attractive career prospects