Hello,
During the "good guy"/"bad guy" list debacle, I was made aware that some people were interested in a cleaned-up version of the logs.tf dataset. I wrote a script that ports the data to a simpler, more legible schema using sqlite3, then added the ability to update it with fresh data from the API. For ease of use, it depends only on Python 3, which should be a common tool for data scientists; no external libraries are required. The schema can be read at the beginning of the script.
https://github.com/ldesgoui/clone_logs
Clones of the first 2378000 logs, processed in 100k chunks, as well as a CSV dump containing only chat logs up until April 2019, can be found at: https://mega.nz/#F!l9oGiKCb!lTWT2RSkTYv-TJZb92_ksA (You don't need the script to make use of the clones; they're just sqlite3 databases.)
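If you just want to poke at one of the downloaded clones without running the script, something like the sketch below should be enough to see what is inside. It only uses Python 3's built-in sqlite3 module and does not assume any table or column names; it asks the database itself. The function name `describe_database` and the file name in the usage note are made up for illustration; the real schema is the one at the top of the script.

```python
import pathlib
import sqlite3


def describe_database(path):
    """Return {table_name: [column names]} for the SQLite file at `path`.

    `describe_database` is a hypothetical helper, not part of clone_logs.
    """
    # Open read-only so a downloaded clone is never modified by accident.
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # sqlite_master lists every object in the file; keep only tables.
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        result = {}
        for table in tables:
            # Escape any double quotes in the table name before quoting it.
            quoted = '"' + table.replace('"', '""') + '"'
            # Column 1 of each PRAGMA table_info row is the column name.
            result[table] = [
                col[1] for col in conn.execute(f"PRAGMA table_info({quoted})")
            ]
        return result
    finally:
        conn.close()
```

For example, `describe_database("some_clone.sqlite3")` (replace with whatever file you downloaded) returns a dict you can print to compare against the schema in the script.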
Best,
Computer nerd