Dealing with Forex data time zones using Pandas

If you have been trading Forex for a while you might have noticed that your broker uses a particular time zone that often does not match the time zone of the data sources you have for simulations. To solve this problem you can either run simulations on the reference data and then transform your broker’s data to match the reference time zone, or change the back-testing data’s time zone to match your broker’s data. In either case, time zone manipulations will come in handy when you’re back-testing and trading with Forex data. In today’s post I want to show you how to perform these manipulations easily using the Pandas library in Python, which will help with your data manipulation needs when doing research or Forex trading.

The Forex market is peculiar in that it completely lacks a central exchange. This means that data timestamps have no universally correct time zone, since no one centralizes transactions, and brokers can therefore choose whichever time zone they consider most convenient (most commonly their local one). The same thing can happen with FX data providers. Sometimes your data will be in UTC, other times at a different GMT offset with DST (daylight saving time), etc. Converting time zones is often difficult because it is usually not simply a matter of adding or subtracting a fixed number of hours: you must also apply the correct offset depending on whether each timestamp falls inside or outside daylight saving time. This is why automated tools for data processing are so convenient: they account for all of this automatically.

The code above uses the Pandas library to convert an FX data file from UTC to the Europe/Madrid time zone, which is GMT+1 in winter and GMT+2 during daylight saving time. Line 13 loads the data while line 14 performs the time zone change. We use the pytz library to first declare the time zone that the data belongs to (in this case UTC) and we then use the tz_convert function to convert the data to the desired time zone, which in this case is Europe/Madrid. After this you can see that I have also called tz_localize(None), which simply removes the time zone information from the data. I have found this useful because several functions do not work well with the time zone localization introduced by the conversion, which gets in the way of further data manipulations. This tz_localize(None) does not undo the conversion you performed before; it simply makes the pandas object no longer “timezone aware”. You can choose from a wide variety of time zones.
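As a minimal sketch of the steps just described (not the original listing), the example below assumes a small in-memory frame in place of the CSV file. The column name and prices are made up for illustration, and the zone names are passed as plain strings rather than pytz objects, which pandas also accepts.

```python
import pandas as pd

# Hypothetical bars stamped in UTC, standing in for the loaded FX file:
# one timestamp in winter and one in summer.
index = pd.to_datetime(["2015-01-15 12:00", "2015-07-15 12:00"])
data = pd.DataFrame({"close": [1.1650, 1.1010]}, index=index)

converted = (
    data.tz_localize("UTC")          # declare the zone the data belongs to
        .tz_convert("Europe/Madrid") # shift to the target zone, DST included
        .tz_localize(None)           # drop the zone info, keep the local times
)

print(converted)
```

The winter row moves forward by one hour and the summer row by two, which is the daylight saving handling that a fixed hour offset would get wrong.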

I would also like to point out that loading the data can be a tricky subject too. In this case the data is loaded from a CSV file that contains the timestamps in the first column, in a format where the day is given first (like dd/mm/yyyy HH:MM). Pandas will attempt to parse the dates automatically, keeping in mind that the day is the first value given within the timestamp. However, if you have a different file format, for example mm/dd/yyyy, you should remove the dayfirst option, as otherwise Pandas will mistakenly assume that the first value is a day value. You can also change index_col from 0 to another value if your timestamps are not located in the first column.
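A minimal sketch of this kind of load, assuming a headerless file with the timestamp in the first column. The file contents and column names here are hypothetical, and an in-memory string stands in for the file on disk.

```python
import io
import pandas as pd

# Hypothetical two-row file in dd/mm/yyyy HH:MM format.
csv_text = (
    "03/02/2015 10:00,1.1340,1.1352\n"
    "04/02/2015 10:00,1.1475,1.1480\n"
)

data = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["timestamp", "open", "close"],  # assumed column layout
    index_col=0,       # timestamps are in the first column
    parse_dates=True,  # parse the index as dates
    dayfirst=True,     # 03/02/2015 means 3 February, not 2 March
)

print(data.index)
```

For a mm/dd/yyyy file you would drop dayfirst=True, and index_col would change if the timestamps sat in a different column.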

It is also worth mentioning that the above process for loading data is particularly slow, since the pandas library has to guess the exact date format you have given it. It is often quicker to load the data without date parsing and then use a function that parses the dates according to the specific format you want. Remember that you can always use the set_index function of a Pandas dataframe to set its index, and you can build that index with a parsing function tailor-made for your data. If you’re working with 1-minute (M1) or tick data this is fundamental, since otherwise you might need to wait for hours before files are fully loaded into memory (yes, pandas date parsing can be that slow).
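One way to do this, sketched under the assumption that the timestamps follow dd/mm/yyyy HH:MM, is to read them as plain strings and parse them afterwards with an explicit format. The file contents and column names are again hypothetical.

```python
import io
import pandas as pd

# Hypothetical 1-minute data; an in-memory string stands in for the file.
csv_text = (
    "03/02/2015 10:00,1.1340\n"
    "03/02/2015 10:01,1.1342\n"
)

# Load the timestamps as plain strings, with no date parsing.
raw = pd.read_csv(io.StringIO(csv_text), header=None, names=["timestamp", "close"])

# Parse with an explicit format so pandas does not have to guess it.
raw["timestamp"] = pd.to_datetime(raw["timestamp"], format="%d/%m/%Y %H:%M")

# Use the parsed column as the index.
data = raw.set_index("timestamp")

print(data)
```

How much time this saves depends on the pandas version and the size of the file, so it is worth timing both approaches on your own data.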


As you may have correctly guessed, data manipulations in Forex trading can be complicated, but thanks to libraries like Pandas even tedious things like time zone changes can be carried out very easily. Furthermore, changing incoming data from a broker to match a desired time zone can become as easy as a few lines of code. If you would like to learn more about modifying data and using it to simulate trading systems, even using a trading framework that can correct a live broker’s data to match any desired time zone, please consider joining Asirikuy.com, a website filled with educational videos, trading systems, development and a sound, honest and transparent approach towards automated trading.

2 Responses to “Dealing with Forex data time zones using Pandas”

  1. Stephen says:

    Thanks for this article Daniel it was timely for me as I was about to look at converting the Asirikuy data to my broker TZ

    • admin says:

      Thanks for writing. Glad it was useful! Make sure you comment on the forum if you have any additional questions.
