Automatically downloading and saving 1M data from FXDD using Python

Access to trading data is very important, as data is the lifeblood of the trading business. However, there aren’t many free Forex data sources out there, and the ones that are available rarely allow for automatic downloading. Today I am going to show you how to use one of these sources, the 1M data from FXDD, to automatically download data for 15 currency pairs using Python. I will also talk about some of the issues with the FXDD data, how they can potentially be addressed and what coding changes you would need to access the two different data repositories available from the FXDD website.

#!/usr/bin/python
import struct
import time
import pandas as pd
import subprocess
import wget

#
# Daniel Fernandez 2016
# http://mechanicalforex.com
# https://asirikuy.com
#

def read_data(symbol, filetype):

    url = "http://tools.fxdd.com/tools/M1Data/"+symbol+".zip"
        
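    print("Downloading file {}".format(url))
    file_name = wget.download(url)
    print("unzipping new data")
    
    # Unzip the file that was actually written: wget.download may pick a
    # different name if an older zip for this symbol is still on disk.
    subprocess.call(["unzip", "-o", file_name])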
    filename = symbol+".hst"
        
    read = 0
    openTime = []
    openPrice = []
    lowPrice = []
    highPrice = []
    closePrice = []
    volume = []

    with open(filename, 'rb') as f:
    
        while True:
            
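            if read >= 148:
            
                # Record size and layout depend on the hst version
                if filetype == "old":
                    record_size, record_format = 44, "<iddddd"
                elif filetype == "new":
                    record_size, record_format = 60, "<Qddddqiq"
                else:
                    raise ValueError("filetype must be 'old' or 'new'")
                    
                buf = f.read(record_size)
                read += record_size
                    
                # Stop at end of file (a truncated trailing record is dropped)
                if len(buf) < record_size:
                    break
                    
                bar = struct.unpack(record_format, buf)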
                    
                # bar[0] is the bar open time as a Unix timestamp
                timeToConvert = bar[0]
                                   
                openTime.append(time.strftime("%Y-%m-%d %H:%M", time.gmtime(timeToConvert)))
                         
                openPrice.append(bar[1])
                
                if float(bar[2]) > float(bar[3]):
                    highPrice.append(bar[2])
                    lowPrice.append(bar[3])
                else:
                    highPrice.append(bar[3])
                    lowPrice.append(bar[2])
                           
                closePrice.append(bar[4])
                            
                volume.append(bar[5])              
            else:           
                buf = f.read(148)
                read += 148
                
    data = {'openTime':openTime, 'open':openPrice,'high':highPrice,'low':lowPrice,'close':closePrice,'volume':volume}
     
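    # Fix the column order explicitly, since the csv is written without a header
    result = pd.DataFrame(data, columns=['openTime', 'open', 'high', 'low', 'close', 'volume'])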
    result = result.set_index('openTime')
    print(result)
     
    result.to_csv(symbol+'_1.csv', header = False)  
    
def main():
    
    symbols = ["EURUSD", "USDCHF", "EURJPY", "USDJPY", "GBPUSD", "AUDUSD","AUDJPY","EURAUD","USDCAD","GBPJPY","EURGBP","CHFJPY","GBPCHF", "AUDCAD", "CADJPY"]
    
    for symbol in symbols:
        read_data(symbol, "old")
        
        
##################################
###           MAIN           ####
##################################

if __name__ == "__main__": main()

The broker FXDD is one of the few that makes its 1M data publicly available for download outside of the MT4 platform (which is very convenient, as updating, exporting and importing data in MT4 is super slow). The data can be downloaded manually from this page as zip files, each containing an hst file that you can load into your MT4 platform. However, if you want to perform a different kind of analysis or carry out some data processing beforehand, this format is really not that useful. It is also fairly inconvenient to manually download tons of files from a website and then unzip and load each one by hand.
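If you are not sure which record layout a given hst file uses, the 148-byte header can tell you. The sketch below assumes the commonly documented MT4 header layout (version, copyright, symbol, period, digits and a few sync fields), which is not described on the FXDD page, so treat the field names and the `parse_hst_header` helper as illustrative rather than authoritative.

```python
import struct

# Commonly documented MT4 .hst header layout (an assumption, not from FXDD):
# version, copyright, symbol, period, digits, timesign, last_sync, 52 unused bytes.
HEADER_FORMAT = "<i64s12siiii52x"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 148 bytes


def parse_hst_header(raw):
    """Decode the header and guess the record layout from the version field."""
    fields = struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE])
    version, _copyright, symbol, period, digits, _timesign, _last_sync = fields
    return {
        "version": version,
        # Strings are NUL-padded, so keep only the part before the first NUL.
        "symbol": symbol.split(b"\x00", 1)[0].decode("ascii", "replace"),
        "period": period,
        "digits": digits,
        # Assumption: version 400 means 44-byte bars, anything else 60-byte bars.
        "filetype": "old" if version == 400 else "new",
    }
```

Reading the first 148 bytes of a file with `f.read(HEADER_SIZE)` and passing them to this function would give you a `filetype` value suitable for the script above.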

The above script takes care of all of this by automatically downloading the FXDD 1M data using Python. The program downloads data for 15 different symbols, unzips the archives, processes the hst binary files into pandas dataframes and finally saves the data in a human-readable csv format that you can use for data auditing or further modifications before using the data for back-testing or other analysis. You can simply delete the downloaded files and rerun the script to reprocess the data and automatically update your data repository from FXDD every week. FXDD usually updates the data promptly after the market closes each Friday. You’ll notice that the data is downloaded from the Metatrader repository, although FXDD has another repository called Metatrader Xtreme that contains less data but more symbols. To use it, modify the above script by appending an X to the symbol names (EURUSD becomes EURUSDX) and to the download URL folder (http://tools.fxdd.com/tools/M1DataX/).
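The repository switch described above could be sketched as a small helper that builds the download URL and the expected hst file name for either repository. The function name `fxdd_download_target` is hypothetical, and the assumption that the Xtreme zip contains a file named like EURUSDX.hst is mine and should be checked against an actual download.

```python
BASE_URL = "http://tools.fxdd.com/tools/"


def fxdd_download_target(symbol, xtreme=False):
    """Return (zip url, hst file name) for one symbol in either repository."""
    if xtreme:
        # Metatrader Xtreme: X appended to both the folder and the symbol
        folder, name = "M1DataX", symbol + "X"
    else:
        folder, name = "M1Data", symbol
    url = BASE_URL + folder + "/" + name + ".zip"
    # Assumption: the hst file inside the zip carries the same base name
    return url, name + ".hst"
```

The post does not say whether the Xtreme files use the 44-byte or the 60-byte record layout, so you may also need to pass "new" instead of "old" to read_data when switching repositories.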


One of the issues with the FXDD data is that the page does not state the GMT offset or the DST convention of the data. Through some analysis I have guessed that the data is GMT +2/+3, but I am unsure whether this DST convention is constant through the entire data set or changes in some years. If you want to use this data, it is very important that you check the DST of each year so that you know exactly how the GMT offset changes with time. For this you can use a script I previously published that uses the NFP release for this purpose. Alternatively, you may use reference points in a different data set to locate the GMT offsets at different points during the year. Figuring this out is extremely important, because an unknown GMT/DST will lead to bad trading results.
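The reference data set approach can be sketched as follows: shift the FXDD timestamps by each candidate number of whole hours and keep the shift that makes the closes agree best with a reference series whose time zone you trust. This is only a minimal illustration, not the NFP-based script mentioned above. The function name, the use of hourly closes and the mean absolute difference as the error measure are all assumptions.

```python
import numpy as np
import pandas as pd


def estimate_offset_hours(candidate, reference, max_shift=12):
    """Return the whole-hour shift that best aligns candidate with reference.

    Both arguments are Series of hourly close prices with a DatetimeIndex.
    A result of 2 means the candidate timestamps run 2 hours ahead.
    """
    best_shift, best_error = None, np.inf
    for shift in range(-max_shift, max_shift + 1):
        shifted = candidate.copy()
        shifted.index = shifted.index - pd.Timedelta(hours=shift)
        # Compare only the timestamps present in both series
        joined = pd.concat([shifted, reference], axis=1, join="inner").dropna()
        if len(joined) < 10:
            continue
        error = (joined.iloc[:, 0] - joined.iloc[:, 1]).abs().mean()
        if error < best_error:
            best_shift, best_error = shift, error
    return best_shift
```

Running this separately on a winter slice and a summer slice of each year would show whether the offset moves between +2 and +3 and in which years it does so.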

Another point is the missing data within the FXDD series. About 1.32% of the 1M bars are missing in the EURUSD series, and most of these missing bars are located within the months of April and May. It is also important to point out that the majority of the missing data is located in 2015 and 2016, with 2005 and 2006 following as the third and fourth years with the most absent data. Although 1.32% is not a large share of the data, it can make a substantial difference depending on whether the missing bars form huge gaps or are just normal points of missing data due to lack of trading, for example during the Asian session. On the good side, there are no duplicate values within the data and there are also no badly formatted bars. If you want to perform a deeper analysis, you can modify the script found here to take the csv file format from the script within this post.
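A basic version of this kind of audit can be sketched with pandas. The function below counts duplicate timestamps and compares the bars against a full one-minute grid, reporting the missing share overall and per year. It is not the script referred to above and will not necessarily reproduce the 1.32% figure, since it assumes that every weekday minute between the first and last bar should exist and that Saturdays and Sundays are fully closed.

```python
import pandas as pd


def audit_minute_bars(index):
    """Summarise duplicates and missing 1M bars for a series of timestamps."""
    index = pd.DatetimeIndex(index)
    duplicates = int(index.duplicated().sum())
    unique = index.drop_duplicates().sort_values()
    # Expected grid: every minute between the first and last bar.
    expected = pd.date_range(unique[0], unique[-1], freq="min")
    # Assumption: the market is closed for the whole of Saturday and Sunday.
    expected = expected[expected.dayofweek < 5]
    missing = expected.difference(unique)
    return {
        "duplicates": duplicates,
        "missing_pct": 100.0 * len(missing) / len(expected),
        "missing_per_year": pd.Series(missing.year).value_counts().sort_index(),
    }
```

Since the csv written by the script has no header and the timestamp in the first column, the index can be loaded with `pd.read_csv("EURUSD_1.csv", header=None, index_col=0, parse_dates=True).index` and passed straight to this function.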


In the end, the FXDD repository has more than 10 years of 1M data that can be accessed for free and downloaded automatically, but this does not mean that it should be used without care. The data has significant holes and the GMT/DST is unknown. If you want to use this data you therefore need to have a plan, either using it only to create systems that would not be seriously affected by these issues or using other data sets to attempt to correct these problems. At the very least, you should find out the GMT/DST of each year so that you can properly design strategies. If you would like to learn more about historical data processing and how you too can evaluate data quality, please consider joining Asirikuy.com, a website filled with educational videos, trading systems, development and a sound, honest and transparent approach towards automated trading strategies.
