Facts, Not Fiction

 
Page 3 of 3 FirstFirst 123
Results 21 to 28 of 28
  1. Collapse Details
     
    #21
    Senior Member
    Join Date
    Oct 2005
    Posts
    2,095
    Quote Originally Posted by Master403 View Post
    One guy's easy-peasy is another guy's not-so-easy-peasy.

    The important point is that whatever Excel tool you use, once you have imported the data, you can extract whatever data amuses you, and spreadsheets can be incredibly powerful.
    Easy Peasy.

    Yes, you are absolutely right. People think differently and use Excel for completely different purposes so learn different subsets of skills. This means they prefer, or only know, their way of doing things.

    I spent a lot of work time standardising methods and structures to maximise maintainability by the lowest common denominator - it's really hard.

    Getting the data.

    Yeah, that's usually the hardest thing, especially as most available data is structured for presentation, not for easy use.

    I normally spend a lot of time checking data and cleaning up errors before I can do any analysis. I since found a couple of probable errors in the 1500 data, so both our numbers are slightly wrong

    Maybe I should start a thread so there's a ready reference for tips and tricks on accessing and using data rather than continuing here?
    Reply With Quote
     

  2. Collapse Details
     
    #22
    Senior Member
    Join Date
    Jul 2016
    Posts
    651
    Quote Originally Posted by El Toro View Post
    Maybe I should start a thread so there's a ready reference for tips and tricks on accessing and using data rather than continuing here?
    Great idea. I have been pulling data from plain text (as in this thread), tables in web pages (Excel has a nice tool), PDFs in web pages, OCR of scans posted as PDFs, and OCR of photos on my phone. I would benefit from almost anything you could post.
    Reply With Quote
     

  3. Collapse Details
    tips on manipulating list data
    #23
    Administrator
    Join Date
    Oct 2005
    Location
    west of Westeros
    Posts
    62,019
    as requested...
    Reply With Quote
     

  4. Collapse Details
     
    #24
    Senior Member
    Join Date
    Jun 2016
    Posts
    1,872
    Quote Originally Posted by Master403 View Post
    No, those are performers. Sort by names (column F in this case). Add a column (in this case L). Put 1 in the first cell of the column (L1). Add a formula in cell 2: =IF(F2=F1,L1+1,1). That will count the number of performances for each performer. Copy L2 into all the cells below it in the column. Select all the cells in column L. Copy, then Paste Special as Values. Sort by 1:Column L, 2:Column B (Rank). 977 performances have a 1 in the last column. Those are the best performances by 977 performers.
    Here's where I am currently "stuck", because I have so little experience using Excel, but I love trying new things.

    "Add a formula in cell 2"


    Where is cell 2 and how do you know? The sheet I've made is displaying data in columns 2 through 11 ( 1 is empty) , they are also lettered A through K. I know what a cell is, so I entered the formula =IF(F2=F1,L1+1,1) in the first and last empty cells in columns 1 and 12, and that didn't do anything other than yield this : 1/1/1900. I used the 8601 lines copied from the men's 1500 at all time athletics ( it doesn't have more than 8601).

    I'm assuming, because that's all I can do for the moment, that each row consists of cells, ( created by the intersection of rows and columns), beginning directly under Column 1 (A), and on my sheet cell 2 is occupied with the rank/8601 times, in this case Abdelali Rayzn, 7564.

    I entered the formula in the 2nd cell, and all it does it create the number 1 every time.
    Last edited by trackCanuck; 11-28-2019 at 05:55 PM.
    Reply With Quote
     

  5. Collapse Details
     
    #25
    Senior Member
    Join Date
    Oct 2005
    Location
    Madison, WI
    Posts
    23,511
    One trick I use can be down with names.

    1 Sort by name
    2 Have a column where the value equals (say) 1 when they are the same and 0 else, then 'hard code this cell [If An+1 = An, 1; else 0] (copy and paste the column as 'Values')
    3 Sort by this column with a secondary sort on Time (or mark); all of the 1s in this column will be at the bottom and you will be rid of all of the duplicates.
    Reply With Quote
     

  6. Collapse Details
     
    #26
    Senior Member
    Join Date
    Jun 2008
    Posts
    3,791
    Quote Originally Posted by Master403 View Post
    Thank you. I did realize that a number of the stat guys on here routinely do this. I was posting specifically for someone who said that getting the 1000th performer out of several thousand performances would be a lot of weeding. It really isn't. The solution to the Decker/Tabb/Slaney or Kleinsasser/Caldwell/Wysocki problem or special characters in names, or typos is more difficult, but data validation is aided a lot by sorting by birthdate and aligning the names.


    Not at all. I knew that 95+% of the board would have no interest. The comment is like saying that someone who publishes a magazine on a specific topic, say, track and field, assumes that the average shlub will read it.
    We sort Excel files of Olympic results. One of the best ways to deal with the married name problem is to sort by first name, not surname. Works better for women
    Reply With Quote
     

  7. Collapse Details
     
    #27
    Member
    Join Date
    May 2009
    Posts
    84
    I see there have been a few references to the IAAF performance lists as people try to get deep lists of performers (or performances). The IAAF lists (or "worldathletics", but I haven't got used to that yet) were significantly extended early in 2018 when they incorporated the data previously available from the www.all-athletics.com website. However, it's worth noting that the all-athletics team mostly focused on data since about 1998 and their earlier data is far from complete. Hence anyone using the current IAAF lists for in-depth "all-time" analysis will find data from before the late 1990's is under-represented.

    Although the all-athletics project was very impressive in the quantity of data they compiled, the information, in my experience, has many more errors than other sources, such as Peter Larsson's excellent alltime-athletics.com or the ATFS annuals. I've asked IAAF to make a few corrections but they haven't responded to my requests (anyone remember Mark "Whiterspoon" ?)

    As a quick example, the IAAF site shows 101 performances (by 42 performers) to 20.70 in the 200m in 1980, whereas the ATFS list has 128 performances (by 43 performers).

    For what it's worth, I stopped using Excel a couple of years ago when Microsoft starting asking for too much money to use it. Instead I use WPS Office which is free to use and seems to have more spreadsheet functionality than Excel.
    Reply With Quote
     

  8. Collapse Details
     
    #28
    Senior Member
    Join Date
    Oct 2005
    Location
    Madison, WI
    Posts
    23,511
    Certainly the logic of the Excel commands and likely even the particulars will work with other programs. I still have a 'forever' license from a while back that still works. Plus I will get my wife's old work computer when it is retired and it has Office.
    Reply With Quote
     

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •