Analysing your books in Goodreads Part 1 – Getting the data

I’ve been analysing the books I’ve read over the last three years. I only started gathering data on my reading habits in 2015, using Excel at first and then Goodreads.

When I first started using Goodreads I added books but did not populate the “Date Read” field. This meant that I could not reliably use the Goodreads data to analyse my reading habits. Also, at that point, I had been buying the bulk of my books from Apple’s iBooks store. Goodreads doesn’t currently have a way to link to iBooks, so I had to figure out a way to pull my data out of the iBooks database on my Mac instead (it was an interesting process that I’ll cover in a separate post).

So before I could analyse my data, I had to make sure the relevant fields were updated. I did that using the methods below.

Before you can analyse the data:

If you’re keen to analyse your own reading statistics via Goodreads, make sure that the “Date Read” field on your listed books is filled in first.

Just to make it all the more confusing, this field is named “Date Read” on your main My Books page but is named “Date Finished” on the individual book page.

This year, I finally found some time to go through Goodreads and update the “Date Read” field for books in 2016 and 2017, and tada! Apparently, I’ve read this much in that time. This then updated my Goodreads “My Stats” page.

Quick tip: If you get into the habit of rating a book as soon as you’ve finished reading it, Goodreads will automatically populate the “Date Read” field.

If the field hasn’t been populated, you can fill it in individually on each book. However, if you’re a voracious reader, you will probably appreciate being able to do this in bulk as well. The next two sections cover how you can do this.

 

Exporting your Goodreads data:

Before you can update your data in bulk, you have to export it so you can quickly check which books have blank fields that need updating.

  1. Log into your Goodreads.com account.

2. Go to My Books

3. On the left sidebar, go to Tools, then select Import and export.

4. You will be taken to the page My Books > Import. On the right sidebar, find EXPORT YOUR BOOKS, and select Export Library.

5. You’ll see the following message. You can navigate away from this page at this point if you wish, but it generally doesn’t take long to generate the file.

6. Once your export file is generated, the message will change. Click on Your export to download the CSV file.
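Once you have the file, checking for blank fields doesn’t have to be done by eye. Here is a minimal Python sketch of that check. It assumes the export has columns named “Title” and “Date Read”; the sample rows are made up purely for illustration and stand in for the real file.

```python
import csv
import io


def books_missing_date_read(csv_file):
    """Return the titles of rows whose "Date Read" cell is blank.

    Assumes the export has "Title" and "Date Read" columns.
    """
    reader = csv.DictReader(csv_file)
    return [
        row.get("Title", "")
        for row in reader
        if not (row.get("Date Read") or "").strip()
    ]


# Tiny in-memory sample standing in for the real export (made-up rows).
sample = io.StringIO(
    "Book Id,Title,Date Read\n"
    "1,Book A,2017/03/15\n"
    "2,Book B,\n"
)
print(books_missing_date_read(sample))  # ['Book B']
```

To run it against your own download, open the file with `open(path, newline="", encoding="utf-8")` and pass the file object in place of `sample`.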

Importing your Goodreads data:

  1. Make a second copy of the file you’ve just exported in the above section. This is so you will always have a backup of the original export to do a full restore in case anything goes wrong. Backups are good. Backups are your friend.

2. You can now update the “Date Read” column in the file.

3. Go back to My Books > Tools, then select Import and export. This time, use the Import from a File section to upload the file you’ve just updated.

4. Just bear in mind that this will overwrite all of your Goodreads data. If you’d rather be safe, delete the columns you’re not updating prior to uploading the file. For example, this is what my final file looked like prior to import:

5. As the upload is occurring, you may see error messages pop up. Turns out that books are matched in the upload using the book’s ISBN. If you’ve added the books from Amazon or added the e-book version, there won’t be an ISBN. Take a note of which books weren’t updated, and then update them manually.
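If you’d like to script the column trimming and spot the books without an ISBN before you upload, the following is a rough Python sketch. Everything named here is an assumption rather than something Goodreads documents: the column names (“Title”, “ISBN”, “ISBN13”, “Date Read”), the choice of which columns to keep, and the idea that ISBN cells may arrive wrapped as `="..."`. The sample rows are invented.

```python
import csv
import io


def clean_isbn(value):
    """Strip an ="..." wrapper from an ISBN cell, if present (assumed format)."""
    return (value or "").strip().lstrip("=").strip('"')


def trim_columns(rows, keep):
    """Keep only the named columns in each row."""
    return [{col: row.get(col, "") for col in keep} for row in rows]


def rows_without_isbn(rows):
    """Return titles of rows that have neither an ISBN nor an ISBN13 value."""
    return [
        row.get("Title", "")
        for row in rows
        if not clean_isbn(row.get("ISBN")) and not clean_isbn(row.get("ISBN13"))
    ]


def write_rows(rows, fieldnames, out_file):
    """Write the rows back out as CSV with a header line."""
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Made-up rows standing in for csv.DictReader output from the real export.
rows = [
    {"Title": "Book A", "ISBN": '="0000000001"', "ISBN13": '="9780000000019"',
     "My Rating": "4", "Date Read": "2017/03/15"},
    {"Title": "Book B", "ISBN": '=""', "ISBN13": '=""',
     "My Rating": "5", "Date Read": "2016/11/02"},
]

keep = ["Title", "ISBN", "ISBN13", "Date Read"]  # hypothetical column choice
trimmed = trim_columns(rows, keep)

out = io.StringIO()  # swap for a real file opened with newline=""
write_rows(trimmed, keep, out)

print(rows_without_isbn(rows))  # ['Book B']
```

The ISBN columns stay in the trimmed file because the upload matches on ISBN; the titles printed at the end are the ones likely to need a manual update.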

 

Checking out your updated data:

Once that’s done, you can now see your data on the Goodreads “My Stats” page! You can access it here or here:

This is the most interesting part and makes all that data entry worth it!

I’ll dig into my own data in Part 2 of this series (coming soon).

Verification and Testing:

To verify this data, I performed a full export of my Goodreads library using the Import and export function as described above.

What I found was that the visual chart in Goodreads is currently incorrect. I verified the counts by comparing the bar chart count for 2017 against the Goodreads bar chart detail count and the Goodreads CSV export count. The results for 2017 are as follows:

  • Goodreads bar chart: lists 31 books
  • Goodreads detail: lists 28 read books and 2 unfinished books
  • Goodreads CSV Export file: lists 28 read books and 2 unfinished books

In Goodreads, books can be marked as “Did Not Finish”, and this ends up applying a value to the “Date Read” column. However, the “Date Read” column is what Goodreads stats uses to flag a book as read, which is technically incorrect if the book was not finished. Therefore, when counting books, any books shelved as “Did Not Finish” need to be manually excluded.
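To see which books are affected, you can list the unfinished books that still carry a “Date Read” value. This is only a sketch under assumptions: the column names “Bookshelves” and “Exclusive Shelf”, and the shelf name `did-not-finish`, are guesses at how such a shelf appears in an export, so adjust them to match your own file. The rows are invented.

```python
def is_unfinished(row, dnf_shelf="did-not-finish"):
    """True if the row sits on the (assumed) did-not-finish shelf."""
    shelves = [
        s.strip().lower()
        for s in (row.get("Bookshelves") or "").split(",")
    ]
    exclusive = (row.get("Exclusive Shelf") or "").strip().lower()
    return dnf_shelf in shelves or exclusive == dnf_shelf


def unfinished_with_date(rows, dnf_shelf="did-not-finish"):
    """Return titles of unfinished books that still have a "Date Read" value."""
    return [
        row.get("Title", "")
        for row in rows
        if is_unfinished(row, dnf_shelf) and (row.get("Date Read") or "").strip()
    ]


# Made-up rows standing in for csv.DictReader output from the real export.
rows = [
    {"Title": "Book A", "Date Read": "2017/01/10",
     "Bookshelves": "did-not-finish, fiction", "Exclusive Shelf": "read"},
    {"Title": "Book B", "Date Read": "2017/02/01",
     "Bookshelves": "fiction", "Exclusive Shelf": "read"},
]
print(unfinished_with_date(rows))  # ['Book A']
```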

Therefore, it can be concluded that the Goodreads detail and CSV export are more accurate. However, you don’t want to be manually counting your books at the detail level, especially if you’re a voracious reader.

At that point, it’s much easier to convert your CSV into an Excel Workbook (by saving it as .XLSX), and then run your analysis from there using pivot tables!

If you do go down that path, make sure that:

  1. Book ID is used as the column to identify distinct books
  2. Unfinished books are excluded from the count
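
If you’d rather sanity-check the pivot table outside Excel, the two rules above can be sketched in Python. The assumptions: the ID column is named “Book Id”, “Date Read” begins with a four-digit year, and unfinished books sit on a shelf called `did-not-finish` in the “Bookshelves” column. The rows are invented and are not my real data.

```python
from collections import defaultdict


def books_read_per_year(rows, dnf_shelf="did-not-finish"):
    """Count distinct Book Ids per year of "Date Read", skipping unfinished books."""
    ids_by_year = defaultdict(set)
    for row in rows:
        date_read = (row.get("Date Read") or "").strip()
        if not date_read:
            continue  # no read date, so the book is not counted
        shelves = [
            s.strip().lower()
            for s in (row.get("Bookshelves") or "").split(",")
        ]
        if dnf_shelf in shelves:
            continue  # rule 2: exclude unfinished books
        year = date_read[:4]  # assumes the date starts with the year
        ids_by_year[year].add(row.get("Book Id"))  # rule 1: distinct by Book Id
    return {year: len(ids) for year, ids in sorted(ids_by_year.items())}


# Made-up rows standing in for csv.DictReader output from the real export.
rows = [
    {"Book Id": "1", "Date Read": "2017/01/05", "Bookshelves": ""},
    {"Book Id": "1", "Date Read": "2017/01/05", "Bookshelves": ""},  # duplicate row
    {"Book Id": "2", "Date Read": "2017/02/01", "Bookshelves": "did-not-finish"},
    {"Book Id": "3", "Date Read": "2016/12/30", "Bookshelves": "fiction"},
    {"Book Id": "4", "Date Read": "", "Bookshelves": ""},
]
print(books_read_per_year(rows))  # {'2016': 1, '2017': 1}
```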

For the purposes of the next post in this series, I’ll primarily be using the data from the CSV export in Excel.


Other posts in series:

  • Analysing your books in Goodreads Part 2 – Poking into my own reading data (coming soon)
  • Bonus section for those purchasing e-books – getting data out of your iBooks app (coming soon)
