Friday, February 13, 2015

Data Science Series

I am writing a series at http://biorpy.blogspot.com/. R is a programming language for statistical computing. Like Python, it is a scripting language. In the series I contrast R with data analysis in Python using modules like numpy and pandas. I am learning R, and since I already know Python, I keep comparing the two.

For Python, it is best to use Anaconda or another scientific distribution that bundles most of the modules. If you need a module that is neither included in Anaconda nor available through the conda command-line utility, you can easily add it with the pip program, again from the command line. Make sure you use the pip that comes with Anaconda, so that the module is installed into the Anaconda Python rather than some other Python on your system. For R, install the base program first, then add RStudio; you can install more packages from the RStudio console.
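As a quick check of the Python setup described above, here is a minimal sketch that reports which interpreter is running and whether numpy and pandas can be found by it. The helper name `missing_modules` is my own (hypothetical, not part of any library), and the sketch assumes only the standard library.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names the current interpreter cannot find."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


# Path of the running interpreter. With a typical Anaconda setup this
# should point inside the Anaconda installation (an assumption).
print(sys.executable)

# numpy and pandas are the modules mentioned in this post;
# an empty list means both are available.
print(missing_modules(["numpy", "pandas"]))
```

If a name shows up in the printed list, it can be added from the command line with `conda install <name>`, or with `pip install <name>` if conda does not provide it.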