Publication: Copyright, data mining and developing models for South African natural language processing

This paper, written by Dr Chijioke Okorie as part of the Program on Information Justice and Intellectual Property’s Right to Research in International Copyright Law project, sets out the issues of copyright ownership and the risk of copyright infringement liability raised when data science research uses data held by public bodies (in particular, public service broadcasters) in South Africa. Considering both the fair dealing exception in South Africa’s Copyright Act of 1978 and the proposed fair use provision in its Copyright Amendment Bill B13F-2017, the paper discusses these issues and explains why data science researchers in public research institutions should not require a copyright licence, or be considered to infringe copyright, when they use copyright-protected materials held by public bodies for data science and artificial intelligence or machine learning research (henceforth, data science research). The paper also suggests that, even where the outputs of data science research are copyright-protected, they should be made available in an open and accessible manner with reasonable safeguards.

The paper is available open access.
