API with Python via Google Colab

A reader pointed out that the Stata-Python integration is only available from Stata 16 onwards, so users of older versions cannot make use of the convenient API approach. This variant of the original guide shows how to download the data via Google Colab, a completely free online Python platform that requires minimal setup.

Step 0 - Preamble

  1. Set up Google Colab in your browser (preferably Google Chrome) if you haven’t already.

  2. Set up your folder on Google Drive. Recommended folder name: Comtrade, with two subfolders: code and i_X_CHN

Step 1 - Mount Google Drive locally

from google.colab import drive
drive.mount('/content/drive')

My personal trick is to keep the drive-mounting code in its own code block, separate from the rest, so you don’t need to re-authorize on every re-run.

Step 2 - Paste the Python code and adjust file directory

  1. Copy and paste everything between Python: and end from the final do-file

  2. Add the following file directory shortcut

root = '/content/drive/MyDrive/Comtrade/i_X_CHN'

In addition, adjust the data export line:
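As a rough sketch of what such an export step might look like when writing to Google Drive: the helper name `save_year`, the CSV format, and the `<year>.csv` file name pattern are all assumptions for illustration, not the original export line.

```python
from pathlib import Path

# Assumption: the same Drive folder as the `root` shortcut.
root = '/content/drive/MyDrive/Comtrade/i_X_CHN'

def save_year(df, year, folder=root):
    """Hypothetical helper: write one year's DataFrame to <folder>/<year>.csv.

    Returns the output path, or None when there is nothing to save
    (for example when the API response had no 'dataset' key).
    """
    if df is None or df.empty:
        return None
    out = Path(folder) / f'{year}.csv'
    out.parent.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    df.to_csv(out, index=False)
    return out
```

With a helper like this, each year's result could be saved by calling `save_year(df, ps)` on the DataFrame the scraper produces.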


Then you should be done and arrive at this:

from google.colab import drive
import json
import numpy        as  np
import pandas       as  pd
import requests

root = '/content/drive/MyDrive/Comtrade/i_X_CHN'
def Comtrade_Scraper   (ps: int,
                       type: str=   'C',
                       freq: str=   'A',
                       px  : str=  'S2',
                       r   : str= 'all',
                       p   : int=     156,
                       rg  : int=     2,
                       cc  : str= 'AG2'):
    """
    Wrapper for creating URLs to access the Comtrade API

    ps   = year
    """
    base      = 'https://comtrade.un.org/api/get?max=10000'
    url       = f'{base}&type={type}&freq={freq}&px={px}&ps={ps}&r={r}&p={p}&rg={rg}&cc={cc}'

    result    = requests.get(url).json()
    if 'dataset' in result: 
        df        = pd.DataFrame(result['dataset'])
        df        = df.replace({None: np.nan})
        df.columns= [i[:32] for i in df.columns]

        return df

for i in range(2000,2022): Comtrade_Scraper(i)
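When a particular year misbehaves, it can help to look at the exact URL being requested. The sketch below rebuilds the same query string with the standard library instead of an f-string, without sending any request. The helper name `comtrade_url` is hypothetical; the parameter names and defaults are copied from the function above.

```python
from urllib.parse import urlencode

def comtrade_url(ps, type='C', freq='A', px='S2', r='all', p=156, rg=2, cc='AG2'):
    """Hypothetical helper: return the request URL without sending it."""
    params = {'max': 10000, 'type': type, 'freq': freq, 'px': px,
              'ps': ps, 'r': r, 'p': p, 'rg': rg, 'cc': cc}
    # urlencode joins the pairs with '&' in insertion order
    return 'https://comtrade.un.org/api/get?' + urlencode(params)

print(comtrade_url(2004))
```

Pasting the printed URL into a browser shows the raw JSON response for that year, which makes it easier to tell an empty result from an error message.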

Output files should look like this:

Some drawbacks:

  • There are some glitches, e.g. the 2004 data was not downloaded (the file is only 315 bytes)

  • Re-running the code block too many times will hit the request limit (somehow more often than with the Stata version)
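Both drawbacks can be softened by re-downloading only the years that look broken instead of re-running the whole loop. The following is a minimal sketch under some assumptions: files are named `<year>.csv`, anything under 1024 bytes is treated as a failed download (the threshold is a guess based on the 315-byte file above), and `download` is a hypothetical callable you supply that fetches and saves one year and returns True on success.

```python
import time
from pathlib import Path

def suspicious_years(root, years, ext='.csv', min_bytes=1024):
    """Return the years whose file is missing or smaller than min_bytes."""
    bad = []
    for year in years:
        f = Path(root) / f'{year}{ext}'
        if not f.exists() or f.stat().st_size < min_bytes:
            bad.append(year)
    return bad

def retry_years(years, download, pause=5.0, attempts=3, sleep=time.sleep):
    """Retry download(year) up to `attempts` times per year.

    Waits a little longer after each failed attempt to stay under the
    request limit. Returns the years that still failed.
    """
    still_bad = []
    for year in years:
        for attempt in range(attempts):
            if download(year):
                break
            sleep(pause * (attempt + 1))
        else:
            # no attempt succeeded for this year
            still_bad.append(year)
    return still_bad
```

A typical use would be `retry_years(suspicious_years(root, range(2000, 2022)), download)`, which touches only the missing or undersized years.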

If you already have a Python setup, e.g. Anaconda or similar, you are probably better off using that than Google Colab. Colab is convenient, but it is not without a cost.