https://www.imdb.com/interfaces/

Subsets of IMDb data are available for access to customers for personal and non-commercial use.
You can hold local copies of this data, and it is subject to IMDb's terms and conditions.
Please refer to the Non-Commercial Licensing page
https://help.imdb.com/article/imdb/general-information/can-i-use-imdb-data-in-my-software/G5JTRESSHJBBHTGX
and the copyright/license terms, and verify compliance.
https://www.imdb.com/conditions

This will import the IMDb dataset TSV files into your MySQL database for further use.
The code is based on the dataset as of February 2020.

No relations are defined between the tables. The data is loaded as-is into plain tables.
Relation tables are not created yet either, so some tables have columns that hold
comma-separated strings.
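
If you need one row per value instead of a comma-separated string, something like the following may help. It is only a sketch and not part of import.php: `split_column` is a hypothetical helper, and it assumes the first column is the row id, the first line is a header, and missing values are written as `\N`.

```shell
# Hypothetical helper: print one "id<TAB>value" line per comma-separated value.
# Usage: split_column FILE COLUMN_NUMBER
split_column() {
    awk -F '\t' -v col="$2" '
        NR == 1 { next }                      # skip the header row
        $col == "\\N" || $col == "" { next }  # skip missing values (assumed marker: \N)
        {
            n = split($col, parts, ",")
            for (i = 1; i <= n; i++) print $1 "\t" parts[i]
        }
    ' "$1"
}
```

The output could be loaded into a relation table of your own. The column number depends on the file you pass in.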

As of March 2020:
The title crew file looks strange. Its longest line is 16313 characters (wc -L title.crews.tsv),
so the directors and writers columns are defined as TEXT rather than
VARCHAR. It is unclear whether this is an error in the data or correct.
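
`wc -L` only reports the longest line as a whole. To see which column is responsible, a per-column check along these lines might be useful. It is a sketch under assumptions: `max_field_lengths` is a hypothetical helper, the first line is taken to be the header, and `length` counts characters or bytes depending on your awk implementation.

```shell
# Hypothetical helper: print "column<TAB>longest value length" for every column.
# Usage: max_field_lengths FILE
max_field_lengths() {
    awk -F '\t' '
        NR == 1 {                             # remember the header names
            for (i = 1; i <= NF; i++) name[i] = $i
            cols = NF
            next
        }
        {
            for (i = 1; i <= NF; i++)
                if (length($i) > max[i]) max[i] = length($i)
        }
        END {
            for (i = 1; i <= cols; i++) print name[i] "\t" (max[i] + 0)
        }
    ' "$1"
}
```

Run it against the crew file and compare the reported lengths with the column sizes you want to use.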

This is not a good example of something that should be written in PHP, but you can use it.
Do not execute it through a web server. It is a CLI script.

# Usage

Download the TSV files from https://www.imdb.com/interfaces/ and place them in the datasets folder.

Decide which ones you need, and alter $filesToImport in import.php to match those files.

Decide whether you need a full-text search index. It is required if you want to use api.php.
Adding the index after the initial import is not a good idea. It takes ages!
Building the index slows down the import. To enable it, set BUILD_INDEX to true in import.php.
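
Before starting the import, it can help to check which files are actually in the datasets folder, so that $filesToImport matches them. The following is a sketch only: `list_datasets` is a hypothetical helper, and the `php import.php` invocation in the comment is an assumption about how the CLI script is started.

```shell
# Hypothetical pre-flight check: list the .tsv files present in a folder.
# Usage: list_datasets DIR
list_datasets() {
    for f in "$1"/*.tsv; do
        [ -e "$f" ] || continue   # the glob stays literal when nothing matches
        basename "$f"
    done
}

# Assumed invocation, run on the command line from the repository root:
#   list_datasets datasets
#   php import.php
```

If the list is empty, the files are probably still compressed or were placed in a different folder.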