What should I master for professional data science in economics and finance?

First, excuse the long beginner question, which probably doesn't even belong here. I know several questions like this have already been answered, but I think this one will be more up to date. Stack Overflow deleted my question and redirected me here.

I study economics and finance at the undergraduate level and, to be honest, I have not really been into programming so far. However, I must admit that you can't do really well nowadays in economics and finance related fields without specific software and programming languages.

In my curriculum I've encountered Matlab, some econometrics software, and of course MS Office, especially Excel with VBA. I have a rough framework in my mind, so please feel free to correct me if I am wrong. As far as I have seen, Matlab, Octave and Mathematica exist for numerical calculations and for the vast majority of the math. For econometrics there are professional packages like EViews, Stata and SPSS, or the open source Gretl, and there is Tableau for data visualization. And last, we can use Excel to manage databases.

Long story short, my basic question is: are the tools above the best ones for the job? Or should I switch to more professional tools, like real programming languages, to get better at solving mathematical problems, numerical calculations, econometrics, data science and high-quality data visualization? What are the most desirable skills in the data science industry nowadays in economic and financial areas?

I have heard that R is a trending statistical programming language these days and that it keeps getting better. I have already written some functions and visualizations in RStudio. I have also heard that SQL is a better option than Excel for managing really massive data sets, but can SQL do everything with data that can be done in Excel? It seems to me that Python is generally the number one language for data analysis; it is flexible and usable on a broad scale. I find Python libraries such as matplotlib, numpy, pandas and bokeh extremely attractive. What about Julia, is it going to be the next R? To be honest, I am also still a little confused by terms like data science, data analysis, data mining, machine learning and big data. Are there any serious differences between them?

Of the tools above, which one should I really focus on and master? Should I keep practicing with the popular software packages, or switch to R, Python, Julia and SQL? Maybe both? Again, we are only talking about graduate and undergraduate level economics and finance, and related jobs. I don't want to develop serious, complex software or applications, just quantitatively analyze stock prices and corporate and economic data, like annual reports, employment, GDP and so on.

Experienced data analysts, please guide me through the confusing forest of data analysis tools. I appreciate every kind of comment.

Topic matlab julia sql python r

Category Data Science


Welcome to the forum. I'm a trained economist, I do a lot of econometrics, and I work in research. My opinion (and it is only an opinion, of course) is that you should focus on R in the first place and consider learning Python. Neither is that hard to learn after all.

Why? R is free and it offers a lot of support for econometrics (it is well regarded in the community). It is also well regarded in the ML community, and you can work with high-end tools such as Keras, LightGBM etc. So you can't go wrong with R. Many top research papers in economics are done with R. Stata is good for some things (like panel data), but there are many things you cannot do with Stata. R is more powerful.

Why Python? Python and R are "similar" with respect to what you can do in econometrics/statistics. But Python offers more flexibility in many respects (some might disagree, but this is how I see it). In essence, Python is a general-purpose language that offers you a lot of possibilities on top of working with data. So have a look at it. It can be very beneficial for your career.
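To give a rough feel for the kind of task mentioned in the question (quantitatively analyzing stock prices), here is a minimal sketch in plain Python. It computes log returns and fits a one-regressor OLS line of stock returns on market returns. The price series are made-up numbers and the function names `log_returns` and `ols_slope_intercept` are my own; in practice you would more likely use pandas and a statistics package, this only shows the idea.

```python
import math

# Hypothetical closing prices (made-up numbers, for illustration only)
stock_prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]
market_prices = [1000.0, 1010.0, 1005.0, 1025.0, 1035.0, 1030.0]


def log_returns(prices):
    """Return the list of log returns ln(p_t / p_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = alpha + beta * x (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


stock_r = log_returns(stock_prices)
market_r = log_returns(market_prices)

# Regress stock returns on market returns; beta is the slope
alpha, beta = ols_slope_intercept(market_r, stock_r)
print(f"alpha = {alpha:.5f}, beta = {beta:.3f}")
```

The same regression is a one-liner in R with `lm()`, which is part of why I say the two languages are similar for this kind of work.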

Data science is an extremely broad field. But even if you focus on econometrics/statistics, many other things are becoming relevant at the moment. Examples are working with "big" data and getting structure into data (e.g. from online sources). I also think that neural nets and tree-based methods such as boosting will become more relevant in economics in the future. Outside the academic world, these methods are actually already used quite a lot for economic problems.
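As a small illustration of getting structure into data, and of the SQL part of the question, here is a sketch that loads a few records into an in-memory SQLite table from Python and aggregates them with a query. The table name `gdp`, its columns and all figures are hypothetical and made up for the example; `sqlite3` ships with Python's standard library.

```python
import sqlite3

# Hypothetical annual figures (made-up numbers, for illustration only)
rows = [
    ("A", 2019, 100.0),
    ("A", 2020, 95.0),
    ("A", 2021, 104.0),
    ("B", 2019, 50.0),
    ("B", 2020, 52.0),
    ("B", 2021, 55.0),
]

# An in-memory database: nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gdp (country TEXT, year INTEGER, value REAL)")
conn.executemany("INSERT INTO gdp VALUES (?, ?, ?)", rows)

# Group by country, roughly what a pivot table would do in Excel
query = """
    SELECT country, AVG(value) AS avg_value, MAX(value) AS max_value
    FROM gdp
    GROUP BY country
    ORDER BY country
"""
summary = conn.execute(query).fetchall()
for country, avg_value, max_value in summary:
    print(country, round(avg_value, 2), max_value)

conn.close()
```

SQL is good at this kind of filtering, joining and aggregating; for the modelling itself you would hand the result over to R or Python.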

Final note: Even if you think you cannot master both (R and Python), it would still be good to have some knowledge of and experience with both. You grow with your problems. Even with only a little experience, you can advance fast if you know where and how to start. This can also help with getting a good job.

P.S. For Python there are also some good packages/applications for economists: https://lectures.quantecon.org/py/
