This is an attempt to gather additional data and uncover new information about European Championship tournaments. The two main efforts are aggregating pitch-by-pitch results to gauge hitter and pitcher effectiveness, and piecing together the very spotty data on batted-ball types to make at least some guesses about the quality of fielders and team defenses, and about batters’ and pitchers’ contributions.
Data limitations and additional details
(specifics for the 2014 tournament to be added...)
Because of limitations in the software used at CEB tournaments (TAS Baseball), and because some very important data is simply omitted when games are scored, the official source is far from perfect. So we have to make the most of what is available.
Notably:
1) Pitch-by-pitch results data (balls, fouls, etc.) doesn’t distinguish bunt attempts. Hence, bunts and non-bunts have to be bundled together.
2) The original HTML game logs do not contain pitch-by-pitch information for unfinished plate appearances when the third out is made on the bases, so overall pitch totals are going to be slightly off.
3) No data is kept on batted-ball type (grounder, liner, fly, etc.), hit direction, or strength. This means there is no data on batted balls that result in base hits, which is a huge omission, apart from very generic descriptions such as a “single through the left side” or a “single to left field”.
The main goal here is simply to divide all hits into two groups: groundballs and airballs.
Unfortunately, there’s no way to differentiate between line-drive hits and fly-ball hits, hence the single category of airballs (non-grounders).
Depending on available descriptions of base hits, the following assumptions are made:
a) infield hits and hits “through the infield” are designated as grounders;
b) singles and doubles “down the lf and rf foul lines” are also counted as grounders (big assumption);
c) all base hits “to the outfield” are considered airballs and scored as flyballs (they could just as well have been marked as line drives).
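As a rough illustration, assumptions a) to c) can be expressed as a small rule-based classifier. This is only a sketch: the function name classify_hit, the label strings, and the exact phrase patterns are hypothetical, modeled on the example descriptions quoted above, and the real wording in the game logs may differ.

```python
import re

GROUNDER = "grounder"
AIRBALL = "airball"  # scored as a flyball, per assumption c)

# Hypothetical phrase patterns; the actual game-log wording may differ.
GROUNDER_PATTERNS = [
    r"\binfield\b",      # a) infield hits, "through the infield"
    r"\bthrough the\b",  # a) e.g. "single through the left side"
    # b) singles and doubles down the foul lines (big assumption)
    r"\b(single|double)\b.*\bdown the (lf|rf) (foul )?line\b",
]
AIRBALL_PATTERNS = [
    r"\bto the outfield\b",                           # c) generic wording
    r"\bto (left|center|right)( center)? field\b",    # c) e.g. "to left field"
]


def classify_hit(description):
    """Return GROUNDER, AIRBALL, or None when no rule applies."""
    text = description.lower()
    # Grounder rules are checked first, so "infield single to ..." stays a grounder.
    for pattern in GROUNDER_PATTERNS:
        if re.search(pattern, text):
            return GROUNDER
    for pattern in AIRBALL_PATTERNS:
        if re.search(pattern, text):
            return AIRBALL
    return None  # description too vague to classify under a) to c)


if __name__ == "__main__":
    for desc in ["single through the left side", "single to left field"]:
        print(desc, "->", classify_hit(desc))
```

Returning None for descriptions that match no rule keeps unclassifiable hits visible instead of silently forcing them into one of the two groups.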