When you browse online for freely available Forex data, you will often find it in the HST binary format used by the MetaTrader 4 (MT4) platform. Working with this format is painful because you usually have to convert it into something you can use (like a csv) through the History Center of the MT4 platform. This is highly inconvenient, since MT4 does not allow batch processing of history files and runs into serious memory limitations on large files. Today we are going to learn how to convert hst to csv without going through the pain of using the MetaTrader 4 platform. We are also going to learn some surprising things about what the MT4 platform actually does when you import and then export an hst file.
–
#!/usr/bin/python
# Coded by Daniel Fernandez
# mechanicalForex.com, asirikuy.com 2015
# Written for Python 2.7 (uses print statements)

import struct
from time import sleep
import time
import pandas as pd
import datetime
import argparse

HEADER_SIZE = 148
OLD_FILE_STRUCTURE_SIZE = 44
NEW_FILE_STRUCTURE_SIZE = 60

def main():

    parser = argparse.ArgumentParser()
    parser.add_argument('-f', '--filename')
    parser.add_argument('-ty', '--filetype')
    args = parser.parse_args()

    filename = args.filename
    filetype = args.filetype

    if filename == None:
        print "Enter a valid filename (-f)"
        quit()

    if filetype != "new" and filetype != "old":
        print "Enter a valid filetype (valid options are old and new)"
        quit()

    read = 0
    openTime = []
    openPrice = []
    lowPrice = []
    highPrice = []
    closePrice = []
    volume = []

    with open(filename, 'rb') as f:
        while True:

            if read >= HEADER_SIZE:

                # header already consumed: read one bar record
                if filetype == "old":
                    buf = f.read(OLD_FILE_STRUCTURE_SIZE)
                    read += OLD_FILE_STRUCTURE_SIZE

                if filetype == "new":
                    buf = f.read(NEW_FILE_STRUCTURE_SIZE)
                    read += NEW_FILE_STRUCTURE_SIZE

                # an empty read means end of file
                if not buf:
                    break

                if filetype == "old":
                    # old records store low before high
                    bar = struct.unpack("<iddddd", buf)
                    openTime.append(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(bar[0])))
                    openPrice.append(bar[1])
                    highPrice.append(bar[3])
                    lowPrice.append(bar[2])
                    closePrice.append(bar[4])
                    volume.append(bar[5])

                if filetype == "new":
                    # new records: time, open, high, low, close, volume, spread, real_volume
                    bar = struct.unpack("<Qddddqiq", buf)
                    openTime.append(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(bar[0])))
                    openPrice.append(bar[1])
                    highPrice.append(bar[2])
                    lowPrice.append(bar[3])
                    closePrice.append(bar[4])
                    volume.append(bar[5])

            else:
                # first pass: skip the file header
                buf = f.read(HEADER_SIZE)
                read += HEADER_SIZE

    data = {'0_openTime':openTime, '1_open':openPrice,'2_high':highPrice,'3_low':lowPrice,'4_close':closePrice,'5_volume':volume}

    result = pd.DataFrame.from_dict(data)
    result = result.set_index('0_openTime')

    print result

    result.to_csv(filename+'_.csv', header = False)

##################################
###           MAIN            ####
##################################

if __name__ == "__main__": main()
–
A few days ago I wanted to see the differences between my historical data and the 1M historical data from FXDD (which is available here for free), and I was quite surprised that I could not find a simple way to convert hst files to csv online. After suffering several MT4 platform crashes and spending more than 5 minutes converting a single file, I was done with it and decided to code my own converter for MT4 hst files. This is how I came up with the script you see above, which you can use to make this conversion whenever you need to. I will now show you how to use this script and talk a bit about the hst file structure.
The hst binary file format is actually not a single file format. There are two versions of this binary type (see here), depending on whether the data was generated for an old MT4 platform or a new MT4 platform. Both versions have a 148 byte header, but the old format stores each OHLC bar in a 44 byte structure while the new format uses a 60 byte structure. The difference arises because the new format stores the bar time as a 64-bit rather than a 32-bit integer and contains additional fields for spread and real_volume, which are quite handy if you want more accuracy in the data that you store. When using the script you can convert any hst file by pointing at the file and stating whether it is in the "new" or "old" format. For example, to convert the 1M data from FXDD I used the command "python hst_to_csv_converter.py -f "EURUSD.hst" -ty old". Since this FXDD data comes in the old format I specified the keyword "old". After you do this the script will generate a pandas dataframe, which will then go into a csv file with the default pandas formatting.
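To make the two layouts concrete, here is a minimal Python 3 sketch that checks the record sizes with struct.calcsize and tries to guess the file type from the header. The format strings are the ones used in the script above. The idea that the header begins with a 32-bit version number (400 for old files, 401 for new ones) is an assumption that does not come from this article, so verify it against your own files. The names record_size, detect_filetype and VERSION_TO_FILETYPE are made up for this example.

```python
import struct

HEADER_SIZE = 148  # header length, same for both versions

# Bar record layouts, little-endian with no padding.
# old: 32-bit time, then open, low, high, close, volume as doubles
# new: 64-bit time, open, high, low, close as doubles,
#      64-bit volume, 32-bit spread, 64-bit real_volume
RECORD_FORMATS = {"old": "<iddddd", "new": "<Qddddqiq"}

# ASSUMPTION (not stated in the article): the first 4 header bytes
# hold a version number, 400 for old files and 401 for new files.
VERSION_TO_FILETYPE = {400: "old", 401: "new"}


def record_size(filetype):
    """Return the size in bytes of one bar record for the given file type."""
    return struct.calcsize(RECORD_FORMATS[filetype])


def detect_filetype(path):
    """Guess 'old' or 'new' from the version field at the start of the header."""
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("file is shorter than the 148 byte header")
    (version,) = struct.unpack_from("<i", header, 0)
    if version not in VERSION_TO_FILETYPE:
        raise ValueError("unrecognised hst version: %d" % version)
    return VERSION_TO_FILETYPE[version]
```

If the guess works on your files you could skip the -ty argument, although passing the file type explicitly remains the safer option.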
This script can easily be adapted to process an entire batch of hst files, which completely rids you of the MT4 platform for the purpose of hst file reading and exporting. Once you have this script there is no need to Import/Export data in MT4: you simply need the hst file and you can obtain a clean csv with the whole data series in human readable format. The data you get is exactly the same as the data you would export from the MT4 History Center (although some differences in formatting are present). The script also loads the data into a pandas dataframe before exporting, so if you know your way around pandas you can already use this loaded data for anything you want to do without the need to save to a csv file and read it back.
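As a rough illustration of batch processing, the Python 3 sketch below walks a folder and converts every hst file it finds. It is not the script above: to stay free of dependencies it writes rows with the standard csv module instead of going through pandas, so the number formatting may differ slightly from the pandas output. The helper names read_bars and convert_directory are hypothetical, and all files in the folder are assumed to share the same format.

```python
import csv
import glob
import os
import struct
import time

HEADER_SIZE = 148
RECORD_FORMATS = {"old": "<iddddd", "new": "<Qddddqiq"}


def read_bars(path, filetype):
    """Yield (time, open, high, low, close, volume) rows from one hst file."""
    record = struct.Struct(RECORD_FORMATS[filetype])
    with open(path, "rb") as f:
        f.seek(HEADER_SIZE)  # skip the header
        while True:
            buf = f.read(record.size)
            if len(buf) < record.size:
                break  # end of file, or a truncated last record
            bar = record.unpack(buf)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(bar[0]))
            if filetype == "old":
                # old records store low before high
                yield (stamp, bar[1], bar[3], bar[2], bar[4], bar[5])
            else:
                yield (stamp, bar[1], bar[2], bar[3], bar[4], bar[5])


def convert_directory(directory, filetype):
    """Convert every *.hst file in directory and return the csv paths written."""
    written = []
    for path in sorted(glob.glob(os.path.join(directory, "*.hst"))):
        out_path = path + "_.csv"  # same naming as the original script
        with open(out_path, "w", newline="") as out:
            csv.writer(out).writerows(read_bars(path, filetype))
        written.append(out_path)
    return written
```

Calling convert_directory("history", "old"), where "history" is a placeholder folder name, would leave one csv next to each hst file.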
Once you have your data in the above csv format you can also easily analyse it using something like R. The code below loads your data into R using the proper datetime format, then changes the column names to match OHLCV, shows you the head of your data and finally draws some simple series plots using the quantmod library. Once you have your data in R you can also perform all sorts of statistical analysis and evaluate things such as entropy, randomness, etc.
–
library(quantmod)

# parse the first csv column as a date-time
f <- function(x, format) {
  as.POSIXct(paste0(as.character(x)), format = format)
}

fxdatatemp <- read.zoo("pathToFile/EURUSD.hst_.csv", sep = ",", format = "%Y-%m-%d %H:%M", header = FALSE, index.column = 1, FUN = f)
eurusd <- xts(fxdatatemp)
colnames(eurusd) <- c('open','high','low','close','volume')
head(eurusd)

chartSeries(eurusd)
barChart(eurusd, theme = 'white.mono', bar.type = 'hlc')
–
Certainly it’s a pleasure when you can process your data as you wish with a wide variety of different tools at your disposal. If you would like to learn more about loading data, comparing data and performing other data analysis tasks using R and python for Forex trading please consider joining Asirikuy.com, a website filled with educational videos, trading systems, development and a sound, honest and transparent approach towards automated trading.
Dear Sir,
I have been looking for such a script for a long time. It would be very kind of you to explain how someone can install and run this script in Python.
Thanks for writing :o) Just copy the script into a blank text file and save it with a name like “script.py”. You can then call it from a command line with “python script.py -f filename -ft filetype (“old” or “new”)”.
I have zero python programming knowledge:
I entered the following in the command line:
python HST2CSV.py -f GER3010.hst -ty “new”
It gives me the following error:
File “HST2CSV.py”, line 27
print “enter a valid filename (-f)”
HST2CSV = the name I gave the file into which I copied your script as above
GER3010.hst = the MT4 hst file, which is saved in the same directory from which I run the command in the command line
Your answer just above shows “-ft” but looking at your code I assume it should be “-ty”
Just to confirm, I have not made any changes to your code as shown above. I don’t know if I should replace “--filename” in line 19 with “GER3010.hst” and “--filetype” in line 20 with “new”?
Hello Chris,
That depends on your symbol name and the type of data you have available. If the hst is in the “new” format then use “new”; if it’s old then use “old”. The wrong format should generate invalid data, so you can just test and see which of the two works,
Best Regards,
Daniel
Chris,
Did you ever get this to work? I can’t get it to run from the command line. It just keeps saying “can not open file ‘DataGrab.py-f AUDCHF.hst’: [Errno2] No such file or directory”. However, the file is there, and if I run it without a file name it runs and asks for the file name.
Thanks in advance
Ryan
Filename is defined using "-f", as the code shows, the file type is defined by "-ty". Enter the filename within quotes (python HST2CSV.py -f "GER3010.hst" -ty "new").
Hi,
I’m new to Python (I do most things in VBA in Excel). Just wondering how easy (or difficult) it would be to run the Python script on a directory with selected HST files and have it convert the whole lot to CSV, so that I can auto-import them into Excel? The reason is that I’d like to create my own 28 pair currency correlation table.
Thanks,
Karl
Hi Karl,
Thanks for writing. The script can be easily modified to do this. You can also perform the correlation table analysis in python, much faster than using excel. Let me know if you have other questions,
Best regards,
Daniel
Where is it even looking for the file? I can’t get it to run due to the same error Chris was getting.
Use python 2.7 and pip install pandas, AND use 64bit python if your HST file is huge.
Dear Daniel,
I wish to convert MT5 history into MT4 history. MT5 history is better with no gaps. Kindly guide me.
Hello everyone,
I'm new to Python and this is the first time I have replied on a website, but Daniel's work deserves a reply.
First of all, thank you Daniel for your code, you are amazing.
Let me explain how I got the code to work. I use Spyder with Python 3.
1º Put parentheses around every print statement (lines 27, 31, 83):
print ("Enter a valid filename (-f)")
instead of:
print "Enter a valid filename (-f)"
2º Create two simple lines after line 24 that set filename and filetype.
In the first one, write the path where your .hst file is located. In my case I want GBPJPY60.hst, so it is:
filename='C:/Users/pepe/AppData/Roaming/MetaQuotes/Terminal/50CA3DFB5856475C5A8F28B45846/history/AdmiralMarkets/GBPJPY60.hst'
You have to find yours, which is quite easy.
3º In the second one, create filetype='new'
The final code in my case is:
# Coded by Daniel Fernandez
# mechanicalForex.com, asirikuy.com 2015

import struct
from time import sleep
import time
import pandas as pd
import datetime
import argparse

HEADER_SIZE = 148
OLD_FILE_STRUCTURE_SIZE = 44
NEW_FILE_STRUCTURE_SIZE = 60

def main():

    parser = argparse.ArgumentParser()
    parser.add_argument('-f', '--filename')
    parser.add_argument('-ty', '--filetype')
    args = parser.parse_args()

    filename = args.filename
    filetype = args.filetype
    filename='C:/Users/pepe/AppData/Roaming/MetaQuotes/Terminal/50CA3DFB5856475C5A8F28B45846/history/AdmiralMarkets/GBPJPY60.hst'
    filetype='new'

    if filename == None:
        print ("Enter a valid filename (-f)")
        quit()

    if filetype != "new" and filetype != "old":
        print ("Enter a valid filetype (valid options are old and new)")
        quit()

    read = 0
    openTime = []
    openPrice = []
    lowPrice = []
    highPrice = []
    closePrice = []
    volume = []

    with open(filename, 'rb') as f:
        while True:

            if read >= HEADER_SIZE:

                if filetype == "old":
                    buf = f.read(OLD_FILE_STRUCTURE_SIZE)
                    read += OLD_FILE_STRUCTURE_SIZE

                if filetype == "new":
                    buf = f.read(NEW_FILE_STRUCTURE_SIZE)
                    read += NEW_FILE_STRUCTURE_SIZE

                if not buf:
                    break

                if filetype == "old":
                    bar = struct.unpack("<iddddd", buf)
                    openTime.append(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(bar[0])))
                    openPrice.append(bar[1])
                    highPrice.append(bar[3])
                    lowPrice.append(bar[2])
                    closePrice.append(bar[4])
                    volume.append(bar[5])

                if filetype == "new":
                    bar = struct.unpack("<Qddddqiq", buf)
                    openTime.append(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(bar[0])))
                    openPrice.append(bar[1])
                    highPrice.append(bar[2])
                    lowPrice.append(bar[3])
                    closePrice.append(bar[4])
                    volume.append(bar[5])

            else:
                buf = f.read(HEADER_SIZE)
                read += HEADER_SIZE

    data = {'0_openTime':openTime, '1_open':openPrice,'2_high':highPrice,'3_low':lowPrice,'4_close':closePrice,'5_volume':volume}

    result = pd.DataFrame.from_dict(data)
    result = result.set_index('0_openTime')

    print (result)

    result.to_csv(filename+'_.csv', header = False)

##################################
###           MAIN            ####
##################################

if __name__ == "__main__": main()
Kind regards,
mitillo