|
Methods For Faster Analysis |
Posted by Steve on May-28-2013 18:29 |
|
Peter,
I have a couple of questions pertaining to array math if you wouldn't mind taking a look:
a. I have data arriving a number of times per minute and wanted to know if there is a lower resolution to ArrayMath.selectStartOfHour so that I can aggregate the data into 1 min or other time intervals ? Doesn't seem like there is but this must be a common question ?
b. I plan to use chart director library to perform some calculations for me which will use array math functions. My question is is there a built in way to handle not having to perform a calculation on the entirety of an array when you only need it to be done for the x most recent index positions ? An example of this would be: ArrayMath.movAvg
Thanks in advance,
Steve. |
Re: Methods For Faster Analysis |
Posted by Peter Kwan on May-28-2013 22:54 |
|
Hi Steve,
As ChartDirector is basically a chart drawing program, it does not have many features for data manipulation. It is expected developers would manipulate the data first before passing them to ChartDirector for plotting.
(a) For data aggregation, the most common method is to use a suitable SQL query (eg. a GROUP BY query). SQL is specifically designed with aggregation features.
The aggregation functions in ChartDirector are mainly designed for aggregating financial data, as the open and close values in financial charts cannot be aggregated using SQL. In financial charts, it is rare to require aggregation at per-minute intervals, so no such feature is built into ChartDirector.
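As a rough illustration of the GROUP BY approach for the values that SQL can aggregate directly (high, low, volume), here is a minimal sketch using Python's built-in sqlite3 module. The table name `ticks` and its columns are hypothetical, and timestamps are assumed to be stored as integer Unix seconds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: ts is Unix time in whole seconds.
conn.execute("CREATE TABLE ticks (ts INTEGER, price REAL, qty REAL)")
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?, ?)",
    [(0, 10.0, 1.0), (20, 12.0, 2.0), (59, 11.0, 1.0), (60, 9.0, 4.0), (130, 9.5, 0.5)],
)

# Integer division by 60 gives a minute index; multiplying back gives the bucket start.
rows = conn.execute(
    """
    SELECT (ts / 60) * 60 AS minute_start,
           MAX(price)     AS high,
           MIN(price)     AS low,
           SUM(qty)       AS volume
    FROM ticks
    GROUP BY ts / 60
    ORDER BY minute_start
    """
).fetchall()

for row in rows:
    print(row)
```

Changing the divisor from 60 to 300 or 900 would give 5 or 15 minute buckets under the same assumptions.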
If you are not using SQL, you may consider creating a "selectStartOfMinute" function (or most of the "selectXXX" functions) in your own code. It only takes around 3 lines of code. In "pseudo code", it is like:
// Loop backwards so that each comparison still sees the original myData[i - 1]
for (int i = myData.length - 1; i >= 1; --i)
    myData[i] = (minutes_since_epoch(myData[i]) != minutes_since_epoch(myData[i - 1])) ? myData[i] : Chart.NoValue;
(b) If you would like to use ArrayMath for data manipulation for some of your data (instead of all of your data), you may simply use "some of your data" as the input to ArrayMath (instead of passing all of your data to ArrayMath).
Hope this can help.
Regards
Peter Kwan |
Re: Methods For Faster Analysis |
Posted by Steve on May-29-2013 07:24 |
|
Peter,
Thanks for your responses. Couple of things:
1. Minute bars are very common in finance and trading. Another very popular timeframe is 15 minutes.
2. Your example code (thanks) - I am a VB6 guy, could you paraphrase that grouping code for me ? Sorry to be a little dumb. Seems that you are running through the array and leaving values where the minute is different from the prior one but for all others assigning the NoValue constant value is that correct ?
3. In relation to your reply for some of the data above, I seem to recall that there is a function which details the lead number of periods for a given indicator or function - is my memory serving me correct ? I can't seem to find it though, could you please advise ? Naturally this would make the specification of exactly the minimum number of periods to use a whole lot easier.
[Note: Yes, I am using the finance chart code if that helps]
Thanks in advance for your continued assistance.
Steve. |
Re: Methods For Faster Analysis |
Posted by Steve on May-29-2013 19:22 |
|
Peter,
Apologies for throwing a barrage of questions at you but I have another one (I have compiled my list of questions here for convenience !):
1. I am doing some playing with finance chart. I have constructed a procedure to group the data into 1 minute intervals. My next question is how do I set the axis to show more than the first time interval ? I note that setDateLabelFormat only goes as low as hourly. The resultant chart looks like the below with just the first minute timestamp showing:
2. In relation to your reply for using some of the data for a calculation of an indicator above, I seem to recall that there is a function which details the lead number of periods for a given indicator or function - is my memory serving me correct ? I can't seem to find it though, could you please advise ? Naturally this would make the specification of exactly the minimum number of periods to use a whole lot easier.
Thanks in advance,
Steve.
|
Re: Methods For Faster Analysis |
Posted by Peter Kwan on May-30-2013 01:59 |
|
Hi Steve,
For financial charts, it is common for the x-axis to show the entire trading day, even if the trading day has just begun. For example, if you look at Yahoo Finance at 11:00am, you can see the x-axis still shows the whole trading day, but the candlesticks or the OHLC bars are only plotted up to 11:00am.
The current design of the FinanceChart is based on similar assumptions - that the chart should span at least a few hours, irrespective of the amount of data. So the built-in code only generates down to hourly labels.
For your case, you may consider generating the chart like a common financial chart, with the whole trading day or at least a few hours on the x-axis. To do this, you just need to create a timeStamp array that spans the whole trading day (or a few hours), while your high/low/open/close/vol arrays can be shorter and contain the actual data you have. (If you are aggregating from raw transaction data to the high/low/open/close/vol data, your code should already be synthesizing the timestamps, so you may just extend the code to synthesize more timestamps to cover the whole day.)
If you really need more detailed labelling, you would need to use custom labelling. To do this, you may use Axis.setLabels to directly set the labels you want to use for the first chart (the priceChart in your case). For example:
Call myFinanceChart.setData(..........)
Set m = myFinanceChart.addMainChart(..........)
Call m.xAxis().setLabels(anArrayOfLabels)
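Building the label array itself is ordinary string work. The sketch below, in plain Python with no ChartDirector calls, produces one label per bar and leaves most of them blank so that only every fifth minute is labelled. The function name `minute_labels` is hypothetical, and how the axis renders empty entries should be checked against the ChartDirector documentation.

```python
from datetime import datetime, timedelta

def minute_labels(timestamps, every=5):
    """Return one label per bar: 'HH:MM' on each `every`-minute boundary, blank otherwise."""
    labels = []
    for ts in timestamps:
        on_boundary = ts.minute % every == 0 and ts.second == 0
        labels.append(ts.strftime("%H:%M") if on_boundary else "")
    return labels

start = datetime(2013, 5, 30, 9, 58)
bars = [start + timedelta(minutes=i) for i in range(8)]  # 09:58 through 10:05
labels = minute_labels(bars)
print(labels)
```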
2. Unfortunately, there is no function to obtain the amount of lead data required. Note that the amount of lead data is determined by your code. For example, the moving average intervals and various technical indicator parameters are supplied by your code. So your code should already know the lead data required. It just needs to keep track of it.
In the "Interactive Financial Chart" sample code, the lead data is simply assumed to be the longest moving average period used, or at least 20 intervals. We assume that the other technical indicators would not require more than 20 intervals of lead data.
Hope this can help.
Regards
Peter Kwan |
Re: Methods For Faster Analysis |
Posted by Peter Kwan on May-30-2013 01:20 |
|
Hi Steve,
(a) If you already have per minute data (or any regular interval data), you may consider using the selectRegularSpacing method to aggregate the data. For example, a regular spacing of 15 can aggregate per minute data into 15-min data.
However, if your data are raw trading data at irregular intervals, then for drawing a financial chart you need not only to "aggregate data", but also to "synthesize data". You may already know that in a normal financial chart, the high/low/open/close values should always exist even if there is no transaction in an interval: the open/high/low/close values should then be equal to the last close value, and the volume should be 0. To synthesize the data, the software must know the stock exchange trading hours, otherwise it cannot tell whether a lack of data is due to no transactions during a trading interval or to the market being closed. So for handling raw trading data, the algorithm needs to synthesize data and take trading hours into account in addition to aggregating.
(The selectRegularSpacing method cannot be used to aggregate into daily data, because a trading day can vary in length. For example, many stock exchanges occasionally trade for only part of a day for various reasons, so we cannot assume each trading day has the same number of trading minutes. The same applies to weekly, monthly and yearly data.)
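The aggregate-and-synthesize step described in (a) can be sketched as follows. This is plain Python, not ChartDirector code, and it assumes a market that trades continuously (so no trading-hours table is needed) and trades that are sorted by time and fall inside the requested range. The function name `build_bars` is hypothetical.

```python
def build_bars(trades, interval, start, end):
    """trades: list of (unix_seconds, price, qty), sorted by time, all within [start, end).
    Returns one (bar_start, open, high, low, close, volume) tuple per interval."""
    bars = []
    last_close = None
    i = 0
    for bar_start in range(start, end, interval):
        bucket = []
        while i < len(trades) and trades[i][0] < bar_start + interval:
            bucket.append(trades[i])
            i += 1
        if bucket:
            prices = [p for _, p, _ in bucket]
            volume = sum(q for _, _, q in bucket)
            bars.append((bar_start, prices[0], max(prices), min(prices), prices[-1], volume))
            last_close = prices[-1]
        elif last_close is not None:
            # No trades in this interval: flat bar at the previous close, zero volume.
            bars.append((bar_start, last_close, last_close, last_close, last_close, 0.0))
        # If nothing has traded yet there is no price to carry forward, so no bar is emitted.
    return bars

trades = [(5, 100.0, 1.0), (40, 102.0, 2.0), (130, 101.0, 1.5)]
bars = build_bars(trades, interval=60, start=0, end=180)
for bar in bars:
    print(bar)
```

For an exchange with fixed hours, the loop over `bar_start` would need to skip intervals when the market is closed, which is the trading-hours knowledge mentioned above.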
(b) The equivalent code in VB6 is:
'Assume myTimestamp is an array of VB Date
For i = UBound(myTimestamp) to 1 Step -1
If Minute(myTimestamp(i)) = Minute(myTimestamp(i - 1)) And DateDiff("s", myTimestamp(i - 1), myTimestamp(i)) < 60 Then myTimeStamp(i) = cd.NoValue
Next
(c) Unfortunately, there is no function to obtain the amount of lead data required. Note that the amount of lead data is determined by your code. For example, the moving average intervals and various technical indicator parameters are supplied by your code. So your code should already know the lead data required. It just needs to keep track of it.
In the "Interactive Financial Chart" sample code, the lead data is simply assumed to be the longest moving average period used, or at least 20 intervals. We assume that the other technical indicators would not require more than 20 intervals of lead data.
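A minimal sketch of that bookkeeping, in plain Python: it applies the "longest period, or at least 20" rule and then cuts the input down to the most recent points plus the lead, so that an indicator is not computed over the whole history. The function names are hypothetical and the rule is only the one described for the sample code, not a guarantee for every indicator.

```python
def lead_points(periods, minimum=20):
    """Lead data needed: the longest look-back period in use, but never less than `minimum`."""
    return max(max(periods, default=0), minimum)

def recent_with_lead(data, visible, periods, minimum=20):
    """Keep only the last `visible` points plus the lead points needed to warm up the indicators."""
    lead = lead_points(periods, minimum)
    start = max(len(data) - visible - lead, 0)
    return data[start:]

closes = list(range(1000))                 # stand-in for a long price history
window = recent_with_lead(closes, visible=100, periods=[15, 31, 14])
print(len(window), window[0])
```

The shortened array would then be the input to the indicator calculation, and the first `lead_points(...)` results would be discarded before plotting.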
Hope this can help.
Regards
Peter Kwan |
Re: Methods For Faster Analysis |
Posted by Steve on May-30-2013 07:27 |
|
Peter,
Thanks vm for the continued assistance. Couple of points:
1. I understand what you are saying about normal trading hours. This is for Bitcoin trading and trading hours do not exist.
2. I had tried selectRegularSpacing but it made no difference to the chart. Code below:
Dim am As ArrayMath
Set am = cd.ArrayMath(Timestamps)
Timestamps = am.aggregate(cd.CTime(Timestamps), cd.AggregateFirst)
HighData = am.aggregate(HighData, cd.AggregateMax)
LowData = am.aggregate(LowData, cd.AggregateMin)
OpenData = am.aggregate(OpenData, cd.AggregateFirst)
CloseData = am.aggregate(CloseData, cd.AggregateLast)
VolData = am.aggregate(VolData, cd.AggregateSum)
am.selectRegularSpacing 1, 1
Changing the first parameter makes no difference to the chart. [Likewise I have had no luck with trying out am.Shift 5 - the chart appearance is not shifted although the object array elements are (and yes as I am using the aggregate function for the timestamps, there are some NoValues in there) - perhaps the order of when selectRegularSpacing and Shift occurs makes a difference ? ]
3. In terms of 1 minute intervals, the below works:
If Minute(TradeData(n).Datetime) > Minute(TradeData(n - 1).Datetime) Then
Timestamps(Cnt) = TradeData(n).Datetime
Else
Timestamps(Cnt) = cd.NoValue
End If
In relation to this - I take it there is no easy way to do the same as the above for 5 and 15 minute intervals without my coding up something to detect the crossing of a 5 and 15 minute interval (in the same fashion as I am using the Minute function above to detect the crossing of a minute boundary) ?
4. I have had a play with using the array math functions to calculate MACD (ie using code from the FinanceChart source). Code is below:
'// Example call
Dim Signal ' As Variant
Signal = addMACD(100, 15, 31, 14, vbRed, vbBlue, vbBlack)
msgbox Signal(0) & vbtab & signal(1)
'// MACD Function (stripped down version of FinanceChart source code)
Public Function addMACD(height, period1, period2, period3, color, signalColor, divColor) As Variant
Dim expAvg1()
expAvg1 = cd.ArrayMath(m_closeData).expAvg(2# / (period1 + 1)).result()
Dim macd ' As Double
macd = cd.ArrayMath(m_closeData).expAvg(2# / (period2 + 1)).subtract(expAvg1).result()
Dim macdSignal
macdSignal = cd.ArrayMath(macd).expAvg(2# / (period3 + 1)).result()
addMACD = macdSignal
End Function
So, this calculates the MACD (I am using the built in random number generator for this) but the first array element (of the returned array - Signal) is always 0. I have searched to try to find if ArrayMath arrays are 1 based, thus explaining the zero value, but cannot find a reference to the base at all. Could you perhaps shed some light on this Peter ?
Thanks once again for your continued assistance with this.
Steve. |
Re: Methods For Faster Analysis |
Posted by Peter Kwan on May-30-2013 16:14 |
|
Hi Steve,
2. There is an example on how to use selectStartOfWeek and selectStartOfMonth in the Interactive Financial Chart sample code. You just need to change the selectStartOfWeek call to the appropriate selectRegularSpacing.
For example, assume your original data are one value per 15 seconds, and you want to aggregate them into one value per minute. The aggregation interval is therefore 4 (4 values aggregate to one value). The code should then be:
Dim am As ArrayMath
Set am = cd.ArrayMath(cd.ArrayMath(TimeStamps).selectRegularSpacing(4))
Timestamps = am.aggregate(cd.CTime(Timestamps), cd.AggregateFirst)
HighData = am.aggregate(HighData, cd.AggregateMax)
LowData = am.aggregate(LowData, cd.AggregateMin)
OpenData = am.aggregate(OpenData, cd.AggregateFirst)
CloseData = am.aggregate(CloseData, cd.AggregateLast)
VolData = am.aggregate(VolData, cd.AggregateSum)
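To make the effect of a regular spacing of 4 concrete, the following plain Python sketch collapses every run of 4 consecutive values into one, using first/max/min/last/sum for open/high/low/close/volume. It only mirrors the idea; it is not ChartDirector's implementation, and how ChartDirector treats a trailing partial group is not covered here. The helper name `aggregate` and the sample numbers are made up for illustration.

```python
def aggregate(values, group_size, how):
    """Collapse each consecutive run of `group_size` values into one value using `how`."""
    return [how(values[i:i + group_size]) for i in range(0, len(values), group_size)]

# Eight 15-second samples become two 1-minute bars (group_size = 4).
open_  = [10, 11, 12, 11, 12, 13, 14, 13]
high   = [11, 12, 13, 12, 13, 14, 15, 14]
low    = [ 9, 10, 11, 10, 11, 12, 13, 12]
close  = [11, 12, 11, 12, 13, 14, 13, 12]
volume = [ 5,  3,  2,  4,  1,  1,  6,  2]

bar_open   = aggregate(open_, 4, lambda chunk: chunk[0])    # first value in each group
bar_high   = aggregate(high, 4, max)
bar_low    = aggregate(low, 4, min)
bar_close  = aggregate(close, 4, lambda chunk: chunk[-1])   # last value in each group
bar_volume = aggregate(volume, 4, sum)

print(bar_open, bar_high, bar_low, bar_close, bar_volume)
```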
3. For 1 minute interval, I suspect there are several issues (or potential issues) in the code "Minute(TradeData(n).Datetime) > Minute(TradeData(n - 1).Datetime)".
(a) The Minute returns a value from 0 to 59. The Minute 0 is not greater than Minute 59, but it is the next minute. (So the next minute is not necessarily larger than the previous minute. It can be smaller, as from 59 to 0.)
(b) If your data are at irregular intervals, or have trading gaps, two Minute values can be equal yet belong to different minutes. For example, 09:30 is different from 10:30 (just in case there is no trading between 09:31 and 10:29), but your code assumes they are the same minute. Similarly, 16:00 (which can be the last minute in a trading day in some stock exchanges) will be considered the same minute as 09:00 (the first minute in the next trading day), although I see this may not affect Bitcoin if it is trading non-stop (24 hours a day and 365 days a year).
In my original code, two timestamps are treated as being in the same minute only if they have the same minute number and are also less than 60 seconds apart. It catches the case of 09:30 and 10:30 (they are more than 60 seconds apart) and it also handles the 59 to 0 transition.
I must admit I have not tested the code, so I am not sure if there is still a bug in it, but this is the reason why it tests for equality (rather than using a "greater than" test), and why there is a "DateDiff" there.
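One way to sidestep both issues, and to cover 5 and 15 minute intervals with the same code, is to compare bucket numbers counted from a fixed epoch instead of comparing the Minute value. The sketch below is plain Python under that assumption; `NO_VALUE` is a stand-in for ChartDirector's NoValue constant and `select_start_of_interval` is a hypothetical name, not a ChartDirector function.

```python
from datetime import datetime, timezone

NO_VALUE = None  # stand-in for ChartDirector's NoValue constant

def select_start_of_interval(timestamps, minutes):
    """Keep the first timestamp in each `minutes`-wide bucket; replace the rest with NO_VALUE."""
    width = minutes * 60
    out = []
    prev_bucket = None
    for ts in timestamps:
        bucket = int(ts.timestamp()) // width   # whole buckets elapsed since the epoch
        out.append(ts if bucket != prev_bucket else NO_VALUE)
        prev_bucket = bucket
    return out

def t(h, m, s=0):
    return datetime(2013, 5, 30, h, m, s, tzinfo=timezone.utc)

stamps = [t(9, 58, 10), t(9, 59, 50), t(10, 0, 5), t(10, 4, 59), t(10, 5, 0), t(11, 5, 30)]
result5 = select_start_of_interval(stamps, 5)
print(result5)
```

Because the bucket number keeps increasing across hours and days, the 59 to 0 wrap and the 10:05 versus 11:05 case need no special handling, and the result is written to a new list so no comparison ever reads an already overwritten entry.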
4. If you try to calculate the MACD signal by hand (or using a spreadsheet), then you will know why the first value must be 0. It is normal and is not an error.
The current exponential average value depends on the current data value and the previous exponential average value. So strictly speaking, the exponential average cannot be computed exactly, as each value depends on the previous value, ad infinitum.
To solve this problem, the exponential average value of the first data value (which has no previous value) is assumed to be the data value itself. It can be mathematically proven that if the sequence is long enough, it will eventually "converge" to the true exponential average.
The MACD is the difference between two exponential averages. As the first values of both exponential averages are equal to the first data value itself, the difference is 0, and the signal line, being an exponential average of that difference, also starts at 0. As the exponential averages converge to the true values, after discarding sufficient lead values, the remaining values should converge to the true MACD signal.
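The seeding behaviour can be checked by hand with a few lines of plain Python. The sketch assumes the usual recurrence (new average = alpha * value + (1 - alpha) * previous average) with the first average set to the first data value, as described above, and it uses the same operand order as the VB6 code earlier in the thread (the period2 average minus the period1 average). The function names and sample prices are made up for illustration.

```python
def exp_avg(data, alpha):
    """Exponential average seeded with the first data value (data must be non-empty)."""
    out = [data[0]]
    for x in data[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

def macd_signal(closes, period1, period2, period3):
    avg1 = exp_avg(closes, 2.0 / (period1 + 1))
    avg2 = exp_avg(closes, 2.0 / (period2 + 1))
    macd = [b - a for a, b in zip(avg1, avg2)]    # both averages start at closes[0], so macd[0] is 0
    return exp_avg(macd, 2.0 / (period3 + 1))     # seeded with macd[0], so the signal also starts at 0

closes = [100.0, 101.0, 103.0, 102.0, 105.0]
signal = macd_signal(closes, 15, 31, 14)
print(signal[0], signal[1])
```

The leading zero is therefore a consequence of the seed rather than of array indexing, which is why the first few values are treated as lead data and discarded.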
Regards
Peter Kwan |
Re: Methods For Faster Analysis |
Posted by Steve on May-30-2013 18:39 |
|
Peter,
That's great. Thanks very much for your help. |
|