OZmium Sports Betting and Horse Racing Forums - mode v's mean v's median for times

12th June 2003, 10:34 PM

billivet

Member

Join Date: Jan 1970

Location: aus

Posts: 4

I am dabbling in handicapping and am trying to relate courses by distance, class, etc. Can anyone help me with a formula for the best way to "average" times for races at a given distance and track? The sample size varies greatly and I think the mode is probably the way to go, but I need a formula to take care of skew. Mode = 3(median) - 2(mean) leaves a little to be desired.
Or am I completely on the wrong "track"?
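
For reference, the rule of thumb quoted above is Pearson's empirical relation, mode ≈ 3 × median - 2 × mean. A minimal sketch of it follows; the sample times are made up for illustration, and the relation is only an approximation that tends to hold for unimodal, moderately skewed data.

```python
from statistics import mean, median

def pearson_mode_estimate(times):
    """Estimate the mode as 3*median - 2*mean (Pearson's empirical relation)."""
    return 3 * median(times) - 2 * mean(times)

# Hypothetical 1200m times in seconds (illustrative only, not real results).
sample_times = [69.8, 70.1, 70.2, 70.4, 70.5, 70.9, 71.6, 72.8]
print(round(pearson_mode_estimate(sample_times), 3))
```

With a long slow tail like this one, the estimate lands below both the median and the mean, which is the behaviour the formula is meant to capture.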

Any advice greatly appreciated

13th June 2003, 07:46 AM

jfc

Member

Join Date: Jan 1970

Location: Sydney

Posts: 402

Both you and Dr Ron are on the wrong track.

I'll explain why later once I find some time to collect my thoughts.

13th June 2003, 09:55 AM

osulldj

Member

Join Date: Jan 1970

Location: Melbourne

Posts: 166

I have established class adjusted standard times for every Australian race track at every distance based on data going back to 1996.

The best approach I have found is to take a trimmed average...and then compare this against the median as a sanity check.

Take the dataset for each distance, take the fastest and slowest 10% of records away and average the rest. Then compare this against the median. If they are close to each other this implies your data is normally distributed which in practical terms it should be if looking across a large enough sample size.
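
A minimal sketch of the trimmed average and median check described above. The 0.10 second tolerance and the sample times are assumptions for illustration. Note that close agreement between the two figures indicates a roughly symmetric distribution, which is consistent with normality but does not prove it.

```python
from statistics import mean, median

def trimmed_mean(times, trim_fraction=0.10):
    """Drop the fastest and slowest trim_fraction of times, average the rest."""
    ordered = sorted(times)
    k = int(len(ordered) * trim_fraction)  # records dropped at each end
    return mean(ordered[k:len(ordered) - k])

def sanity_check(times, tolerance=0.10):
    """Return (trimmed mean, median, whether they agree within tolerance)."""
    tm, med = trimmed_mean(times), median(times)
    return tm, med, abs(tm - med) <= tolerance

# Hypothetical good-track times in seconds for one track and distance.
times = [69.5, 70.0, 70.1, 70.2, 70.3, 70.4, 70.5, 70.6, 70.8, 73.0]
print(sanity_check(times))
```

Here the 73.0 outlier is trimmed away, so the trimmed average stays close to the median even though the plain average would be pulled upwards.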

Hope this helps.

13th June 2003, 07:40 PM

billivet

Member

Join Date: Jan 1970

Location: aus

Posts: 4

Thanks osulldj - just what I needed - somebody who has done it and whose data comes out OK

16th June 2003, 03:02 PM

jfc

Member

Join Date: Jan 1970

Location: Sydney

Posts: 402

osulldj,

Your method is flawed in that it doesn't cope with track conditions. So if route X has a significantly different proportion of runs on worse than good tracks than route Y, the resulting computed relationship will be wrong.

Onto why average times don't work, apropos the track record thread.

Typically averages will produce a significant number of situations where superior classes have INFERIOR standard times. That is the quintessence of "not working".

I believe this is because (unlike the USA) horses are ridden so as to loaf as much as possible. i.e. too many slow paced races.

Quote:

On 2003-06-13 09:55, osulldj wrote:
I have established class adjusted standard times for every Australian race track at every distance based on data going back to 1996.

The best approach I have found is to take a trimmed average...and then compare this against the median as a sanity check.

Take the dataset for each distance, take the fastest and slowest 10% of records away and average the rest. Then compare this against the median. If they are close to each other this implies your data is normally distributed which in practical terms it should be if looking across a large enough sample size.

Hope this helps.

16th June 2003, 09:10 PM

osulldj

Member

Join Date: Jan 1970

Location: Melbourne

Posts: 166

Hi jfc

[quote]
On 2003-06-16 15:02, jfc wrote:
osulldj,

Your method is flawed in that it doesn't cope with track conditions. So if route X has a significantly different proportion of runs on worse than good tracks than route Y, the resulting computed relationship will be wrong.

*******
I understand your point. However, my method does cope with different track conditions: only races conducted on a track declared as "good" are considered in determining the trimmed average. Many years of historical information ensure appropriate sample sizes, and there are also other professional statistical tests to ensure the validity and reliability of the numbers.

*******

Onto why average times don't work, apropos the track record thread.

Typically averages will produce a significant number of situations where superior classes have INFERIOR standard times. That is the quintessence of "not working".

I believe this is because (unlike the USA) horses are ridden so as to loaf as much as possible. i.e. too many slow paced races.

*******
Again, I understand your point and it's a valid one....but not unsolvable.
Every race in my database for the last 6 years has an associated class value....it far more accurately reflects the class of race than the description given by race clubs. Initially the averages are taken across all races so as to provide the trimmed average time and average class value at that track.

It's then done by class value range and what you find is a normal distribution of times. Within a given class there are times ranging from slow for that class to fast...this reflects your point about the influence of pace and also natural variability in ability...I can tell you that in nearly all cases, across a large enough sample, the times are normally distributed. There are just as many races run at very fast pace which produce times that are rarely repeated as there are races at slow pace that don't provide an accurate indication of the field's ability.

The times at the different class value ranges, plus some regression analysis across all tracks and distances, provide an indication of a standard time at what I call zero class value. All tracks and distances therefore have a time that can be directly compared.

The standard times at Port Macquarie I use will be much faster than the actual average time run there because the horses that run there are lower class animals. The standard represents a 0 class value race, well beyond the class of horse that races there. This enables me to compare directly figures at Rosehill to Wyong to Port Macquarie etc.
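
One way to picture the "zero class value" idea is a straight-line fit of time against class value, read off at class value 0. The sketch below is only an illustration under stated assumptions: the post does not spell out the actual regression, the class value scale (weaker races given negative values here) is hypothetical, and so are the data points.

```python
def zero_class_time(class_values, times):
    """Least-squares fit of time = intercept + slope * class_value.

    Returns (intercept, slope); the intercept is the fitted time at class value 0.
    """
    n = len(times)
    mean_x = sum(class_values) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in class_values)
    if sxx == 0:
        raise ValueError("need at least two distinct class values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(class_values, times))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical trimmed averages by class value range at one track and distance.
class_values = [-30, -20, -10]   # assumed scale: 0 is the reference class
avg_times = [70.5, 70.0, 69.5]   # seconds, illustrative only
print(zero_class_time(class_values, avg_times))
```

The intercept is faster than any time actually recorded in the sample, which matches the Port Macquarie remark: the standard describes a class of race the track may never host.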

If I run a quick query in my database related to my average speed numbers by some of the class groups I see the following:

2YO HCP Avg = 85
3YO HCP Avg = 95
3YO G1 Avg = 102
Class5 Hcp Avg = 93
Class6 Hcp Avg = 95
R1 HCP Avg = 94
R2 HCP Avg = 96
F&M HCP Avg = 96
F&M G1 Avg = 103
Open HCP Avg = 99
Open G1 HCP Avg = 105

These are not my numbers; they are the actual averages for each class group from the last 2 years, and they come out as I would expect them to, reflecting the difference in class. Better horses run faster.

My own approach also incorporates the influence of early speed on overall time which is very clear when understood. This largely overcomes the objection most have that you can't use speed because pace makes times so variable.

I can honestly say that no expense has been spared in the investment in technology to produce and maintain my own data for every race at every track held around Australia every day.

Does it all make me an automatic winner? Of course not. It's one approach that provides information and tools others don't have and don't understand. It's my winning edge. The cost is and continues to be funded from punting winnings at >30% POT for the last 4 years.

So jfc I take your points, they are very valid and someone looking to embark on the speed journey should take note of what you say for their own learning. However, your points are not unsolvable with the right data, technology and concepts. I have proven that, maybe not to you, but most importantly myself and a small group of professional colleagues who share the workload.

So don't take offence if I say your assertion that 'it doesn't work' is plain not true :smile:

17th June 2003, 02:01 PM

jfc

Member

Join Date: Jan 1970

Location: Sydney

Posts: 402

osulldj,

I've just started on your earlier posts and internet site (which you've curiously omitted from your profile). That odyssey will probably take a while, but at first glance your ideas are close to my independent ones about the issues for advanced time and pace rating.

Meanwhile, back to the point of contention.

I claimed that "AVERAGE times don't work". NOT that "times don't work".

Your process does much more than merely average times.

Now, as briefly and simply as possible, here's a technique which does NOT use averages, but I believe is more precise.

A mere 29 comparisons may not seem much, but it's nearly the best available for any track. This one is for same meeting 1200m races at Kembla between Maiden and Open handicaps.

The median data is:

70.85 Maiden
69.63 Open

Difference = 1.22 seconds
Ratio = 1.017521

Call these Opens "standard" and convert these Maidens to "standard" by dividing by 1.017521.
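
The arithmetic above, as a short sketch using the two medians quoted in the post. The function name is an assumption for illustration.

```python
maiden_median = 70.85  # seconds, from the post
open_median = 69.63    # seconds, from the post

difference = maiden_median - open_median
ratio = maiden_median / open_median

def maiden_to_standard(maiden_time, ratio=ratio):
    """Convert a Maiden time to its Open ("standard") equivalent."""
    return maiden_time / ratio

print(round(difference, 2), round(ratio, 6))
print(round(maiden_to_standard(71.50), 2))  # a hypothetical Maiden time
```

Dividing by the ratio rather than subtracting 1.22 seconds scales the adjustment with the time itself, so a slow Maiden is pulled in slightly more than a fast one.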

This now allows us a much improved 46 comparisons between Class 2 HCP and Standard.

The median difference for Class 2 is 0.76 seconds slower than Standard.

Continue in this way to build a precise class ladder.

Then once the class factor has been removed select the time (say) 10% away from the fastest one to use as a standard 1200m time for Kembla Opens.

In my reduced example above the times read:

68.18 Record (appears legitimate)
69.33 10% from record
70.31 Median
70.47 Average
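
A minimal sketch of picking the time "10% away from the fastest". It assumes this means the time one tenth of the way down the list when times are sorted fastest first; the post does not define it more precisely, and the sample data is hypothetical.

```python
def time_near_fastest(times, fraction=0.10):
    """Return the time sitting `fraction` of the way down the fastest-first list."""
    if not times:
        raise ValueError("no times supplied")
    ordered = sorted(times)
    index = min(int(len(ordered) * fraction), len(ordered) - 1)
    return ordered[index]

# Hypothetical class-adjusted 1200m times in seconds (illustrative only).
adjusted = [68.2, 69.1, 69.3, 69.6, 69.9, 70.0, 70.2, 70.3, 70.3, 70.4,
            70.5, 70.6, 70.7, 70.9, 71.0, 71.2, 71.4, 71.8, 72.3, 73.1]
print(time_near_fastest(adjusted))
```

With 20 times the function skips the two fastest and returns the third, so a single freak record does not set the standard.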

The reasoning behind all this is:

jfc

[ This Message was edited by: jfc on 2003-06-17 14:05 ]

[ This Message was edited by: jfc on 2003-06-17 14:06 ]

17th June 2003, 11:46 PM

La Mer

Member

Join Date: Jan 1970

Posts: 578

Interesting discussion guys - and one that indicates that there are a number of roads to Damascus.

While not necessarily disagreeing with JFC, I think that osulldj is closer to the Holy Grail (if one exists, which it probably doesn't).

I'm not sure if I've fully understood what JFC wrote in his last post on this theme, but using his 1200m Kembla Grange examples and based on many hundreds of races over the last eight years or so, the following are my own standard times for this track (I should add that like osulldj, I also use the median time as a reality test):

Good Track Conditions:
Maidens: 71.24 (71.23)
Class 1: 70.78 (70.70)
Class 2: 70.83 (70.91)
Class 6: 70.72 (70.81)
Open: 70.02 (70.01)
Median times in brackets

The above type of skew in my experience is common - maidens return the worst averages with the class 1's to 6's being somewhat all over the show - track to track Australia wide.
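
The inversions in the table above can be listed mechanically. A small sketch, assuming the classes are ordered from weakest (Maidens) to strongest (Open); the figures are the ones quoted in the post.

```python
from itertools import combinations

# (class, average, median) in seconds, ordered weakest to strongest.
standard_times = [
    ("Maidens", 71.24, 71.23),
    ("Class 1", 70.78, 70.70),
    ("Class 2", 70.83, 70.91),
    ("Class 6", 70.72, 70.81),
    ("Open", 70.02, 70.01),
]

def inversions(ladder, column):
    """List (weaker, stronger) pairs where the stronger class has the slower time."""
    return [
        (weaker[0], stronger[0])
        for weaker, stronger in combinations(ladder, 2)
        if stronger[column] > weaker[column]
    ]

print("averages:", inversions(standard_times, 1))
print("medians: ", inversions(standard_times, 2))
```

On the averages only Class 2 is out of order against Class 1; on the medians both Class 2 and Class 6 come out slower than Class 1.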

So while I accept some of what you've stated re the classes, I disagree with both of you in some aspects.

JFC raised the very valid issue of different track conditions but this can be overcome to a degree by the establishment of a daily track variant.

Even then, the times on the day will be affected by the classes of both the races and horses competing on any given day but these issues can be handled.
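
The post does not say how the daily track variant is calculated. One common approach, shown here purely as an assumption, is to take the median gap between each race's actual time and the standard time for its class on the day. The standards are the 1200m figures from the table above; the day's results are hypothetical.

```python
from statistics import median

# 1200m standard times in seconds, taken from the table above.
standards = {"Maidens": 71.24, "Class 2": 70.83, "Open": 70.02}

def daily_variant(results, standards):
    """Median of (actual time - class standard) over one meeting's races."""
    return median(time - standards[race_class] for race_class, time in results)

# Hypothetical results from one meeting: (class, actual time in seconds).
todays_results = [("Maidens", 71.84), ("Class 2", 71.33), ("Open", 70.72)]

variant = daily_variant(todays_results, standards)
print(round(variant, 2))                # positive means the track ran slow
print(round(71.84 - variant, 2))        # Maiden time adjusted for the day
```

Using the median rather than the average keeps one slowly run race from distorting the variant for the whole meeting.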

Another issue raised by JFC was slowly run races, at least races where the early pace was slow.

This too can be overcome by the establishment of both early and late standard sectional times - I use such a method to great success.

As an example, while the overall standard time for Open Handicaps at Kembla Grange is 70.02s, a more accurate time for those that are run where the early pace is OK is 69.81, over a length quicker.

These more accurate standard times (which should be termed as benchmark times) can be assessed by only using races where the early pace is faster than the early sectional standard time, which in turn can be assessed in a similar manner to how osulldj works out his overall standard times.
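
A minimal sketch of the benchmark idea: keep only the races whose early sectional beat the early standard, then average their overall times. The early standard and race data are hypothetical, and a plain average is used for brevity where a trimmed average would be closer to the method described.

```python
from statistics import mean

def benchmark_time(races, early_standard):
    """Average overall time of races run faster than the early sectional standard.

    races: list of (early_sectional, overall_time) pairs in seconds.
    """
    truly_run = [overall for early, overall in races if early < early_standard]
    if not truly_run:
        raise ValueError("no races beat the early sectional standard")
    return mean(truly_run)

# Hypothetical Open Handicap data: (early sectional, overall time) in seconds.
races = [(34.6, 69.7), (35.4, 70.4), (34.8, 69.9), (35.9, 70.6)]
early_standard = 35.0  # assumed early sectional standard

print(round(mean(overall for _, overall in races), 2))   # all races
print(round(benchmark_time(races, early_standard), 2))   # truly run races only
```

Filtering out the slowly run races gives a faster figure than the overall average, in the same direction as the 70.02 versus 69.81 example.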

[ This Message was edited by: La Mer on 2003-06-17 23:48 ]

[ This Message was edited by: La Mer on 2003-06-17 23:50 ]

18th June 2003, 08:41 AM

osulldj

Member

Join Date: Jan 1970

Location: Melbourne

Posts: 166

Hi La Mer,

You make good sense.

* There is no such thing as a holy grail...but there are such things as tools and processes that provide information that isn't mainstream and which, used intelligently, provide a winning advantage :smile:

* A daily track variant does cater for differences in track conditions.

* You said: ".....can be overcome by the establishment of both early and late standard sectional times - I use such a method to great success."

Couldn't agree more and this is a feature of my information systems for every track in Australia that records sectional times.


18th June 2003, 11:48 AM

jfc

Member

Join Date: Jan 1970

Location: Sydney

Posts: 402

I appreciate these comments and the obvious time taken to make them, but feel compelled to clarify my position. I am well aware of factors such as track variance, sectionals and early speed, and endorse their proper use.

However I was trying to keep the discussion as simple as possible, partly for the benefit of Dr Ron and billivet who are just starting out on this exercise.

Now, I have only one issue with osulldj and La Mer and perhaps the rest of humankind, namely:

Typically it's wrong to make time comparisons from different meetings. And that's what averaging is, in a roundabout way. So averaging times is wrong.

Official track readings are notoriously untrustworthy so there could be even 1 second's difference between two allegedly good tracks. Penetrometers also appear fishy, perhaps because those readings are performed many hours before the races start.

Therefore I only make comparisons between races at the same meeting.

So in the earlier Kembla example, there were 43 Opens and 167 Maidens available, but I was only prepared to compare the 29 matches on identical days.
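
A minimal sketch of restricting the comparison to same-meeting races. It assumes one race per class per meeting, keyed by date, and takes the median of the paired differences; the dates and times are hypothetical.

```python
from statistics import median

def same_meeting_difference(class_a, class_b):
    """Compare two classes using only meetings where both were run.

    class_a, class_b: dicts mapping meeting date -> time in seconds.
    Returns (number of matched meetings, median of a - b).
    """
    shared = sorted(class_a.keys() & class_b.keys())
    if not shared:
        raise ValueError("no meetings in common")
    return len(shared), median(class_a[d] - class_b[d] for d in shared)

# Hypothetical 1200m results by meeting date.
maidens = {"2002-03-02": 71.0, "2002-04-13": 70.6,
           "2002-05-04": 71.9, "2002-06-01": 70.8}
opens = {"2002-03-02": 69.8, "2002-05-04": 70.5,
         "2002-06-01": 69.7, "2002-07-06": 69.5}

print(same_meeting_difference(maidens, opens))
```

The slow day on 2002-05-04 affects both classes equally, so it drops out of the difference instead of inflating one class's average.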

By contrast I assume osulldj would average the most middle of the road 35 Opens and 133 Maidens.

---

La Mer's independent data is very useful. I note he calculates the difference between 1200m Maidens and Opens as 1.22 seconds. Precisely the same difference I found using my different techniques.

However the Class 2 versus Open differences are 0.81 versus 0.76. Or ~half a length. Not too bad considering timing to 100ths of a second only started there in 1997.

It's also fascinating that he finds that the superior Class 2's have inferior times to Class 1's. Confirming my earlier claim.

But to my amazement I note that my technique finds Class 1 races are 0.88 inferior to Opens, hence 0.12 inferior to Class 2. Which is the way you'd want it to be. (Although untrimmed averages also get a 0.08 difference in the right direction).
