[D] What’s with the Linguistic Data Consortium paywall for... everything?

What’s up with LDC paywalling every corpus, at prices up to $8,000? Look at the surreal price of a one-year subscription. It is out of anybody’s league. Are there any open alternatives for speech recognition?

We have a lot of open datasets for other domains: computer vision, image processing, medical scans. Only audio datasets are being kept behind a paywall. According to another Reddit post, LDC had plenty of public funding, and yet nothing changed.

They have old stuff dating back to 1992, and none of it is released for free. Is it common for US universities to have access to this kind of material? They sound more like a corporation than a consortium. Not having large, up-to-date corpora holds the whole field back, because every new research project in the domain wastes time building yet another dataset.

submitted by /u/MrBojanglesReturns