Today we will explore how to build more complex plots using pandas and matplotlib. Last week, I wrote a post covering common plots that can be created with these libraries. However, I did not get a chance to go deeper into them. Therefore, you can consider this as a follow-up to that post.

First, I’ll be introducing a very convenient Python library to download global economic data from the World Bank. I’ll also be covering how to format the tick labels for each axis, and the titles and labels for the plot. On top of that, I’m showing how you can use subplots to join data series that share a common axis, for example, stock prices and trading volume. You can find the link to the code at the end of the article.

Dataset For This Post

For this post, I decided to get some data about different countries’ economies and populations. The World Bank has several databases to choose from. Now, you could download the data manually, but I prefer a more programmatic way of doing it. Luckily, I found a Python package that implements the World Bank API to make downloading the data easier.

Expect a full blog post in the future covering this library in more detail. I think it is very useful for data acquisition. The World Bank has a lot of data on every country, all the way back to 1960. To keep this post manageable, I will only focus on the current top 10 world economies by total GDP, according to this Investopedia article.

The download process is relatively simple. The World Bank uses code names for different types of data. I found the ones I was interested in by searching on Google. Here are some of the codes:

Don’t pay too much attention to the undefined variables and objects in the code above. I just wanted to share how simple it is to fetch the data. The code above will return a multi-index pandas dataframe like this:

[Figure: Dataset format as it comes from wbgapi package]

For this exercise, I found it easier to work with individual country dataframes, so I had to split them based on their country codes. Additionally, I turned the time column into a DateTime column and set it as the index. Finally, the column names were replaced with more human-readable strings. The resulting dataframe for each country will look like this:

[Figure: Showing dataframe corresponding to USA data from the World Bank]

Pandas Subplots

We can tell pandas to create a separate plot for each series (column) in the dataframe by passing the subplots boolean parameter to the plot function:

#- format y tickers manually (it could be done in loop, but I kept getting some problems)
Axes_major_formatter((lambda x,pos: f"%"))

And the output will be:

[Figure: Chart of top 10 economies, showing their GDP growth and actual dollar value]

In this case, I’m creating the plot through the matplotlib API because I find it easier to customize. There are a few ways to create subplots like the one above. In this case, I’m using a mosaic layout (line 11) to tell matplotlib the names and layout of each subplot. That means that the top graph will be twice as high as the bottom one. The rest of the code is simply formatting labels and ticks as in the previous example.

Suggestions For Plot Customization

For me, the trickiest part of plotting with matplotlib is getting the tick labels to look right. In this post, you can find several examples where the ticks are being modified. Additionally, I would try to remember the basic methods such as set_xlabel, set_ylabel, set_title, etc. If you don’t need to apply too many customizations, the pandas plot interface will probably give you enough options. I would explore that API in more detail, because it can simplify the process of generating graphs. You can copy that code and adapt it to your needs. Also, take a look at the Jupyter notebook here.
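To illustrate the mosaic layout and tick formatting discussed in this post, here is a minimal sketch. It is not the code from the notebook. The dataframe is synthetic (made-up numbers, not World Bank data), and the function names `make_sample_data` and `build_figure`, the column names `gdp_growth` and `gdp_usd`, and the panel names "growth" and "gdp" are all assumptions invented for the example. The 2:1 height ratio is set through `gridspec_kw`, which is one of several ways to do it.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter


def make_sample_data():
    """Build a small synthetic dataframe (hypothetical values, not real data)."""
    index = pd.to_datetime([f"{year}-01-01" for year in range(2010, 2020)])
    rng = np.random.default_rng(0)
    growth = rng.normal(2.0, 1.0, size=len(index))   # yearly growth in percent
    gdp = 15e12 * np.cumprod(1 + growth / 100)       # dollar value derived from growth
    return pd.DataFrame({"gdp_growth": growth, "gdp_usd": gdp}, index=index)


def build_figure(df):
    """Create two stacked subplots that share the x axis; the top one is twice as high."""
    fig, axes = plt.subplot_mosaic(
        [["growth"], ["gdp"]],                   # one named panel per row
        gridspec_kw={"height_ratios": [2, 1]},   # top panel gets 2/3 of the plot height
        sharex=True,
        figsize=(8, 6),
    )

    axes["growth"].plot(df.index, df["gdp_growth"], marker="o")
    axes["gdp"].plot(df.index, df["gdp_usd"], color="tab:green")

    # Format the y tick labels: percentages on top, trillions of dollars below.
    axes["growth"].yaxis.set_major_formatter(
        FuncFormatter(lambda x, pos: f"{x:.1f}%")
    )
    axes["gdp"].yaxis.set_major_formatter(
        FuncFormatter(lambda x, pos: f"${x / 1e12:.1f}T")
    )

    # The basic labelling methods mentioned in the post.
    axes["growth"].set_title("GDP growth and GDP (synthetic data)")
    axes["growth"].set_ylabel("Growth")
    axes["gdp"].set_ylabel("GDP")
    axes["gdp"].set_xlabel("Year")
    return fig, axes
```

Calling `fig, axes = build_figure(make_sample_data())` returns the figure and a dictionary of axes keyed by the panel names, so each subplot can be customized further by name, and `fig.savefig("example.png")` would write the result to disk.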