Affiliated with RC: false
Status: Published
Trails of Data: Three Cases for Collecting Web Information for Social Science Research
Li, F.; Zhou, Y.; Cai, T.
2019-11-01
Source Publication: Social Science Computer Review
ISSN: 1552-8286
Pages: 1-21
Abstract

As the availability of online data grows rapidly, researchers are confronted with a pressing question: How should social scientists collect Internet data for research? This study focuses on one of the most commonly used data collection techniques: web scraping. Going beyond canned approaches by leveraging a general framework of data communication, this study illustrates how online information can be systematically queried and fetched for reproducible research. To generalize our approaches, we additionally explore the variations in site security and architecture that analysts may encounter during the scraping process before they are given access to the desired data. The approaches we introduce do not rely on any proprietary software and can be easily implemented on any computing platform with programming languages such as Python or R. The methodological discussion in this study is meant to be applicable to current web-based research efforts. We include three examples with complete Python implementation. We also present an integrated workflow that enables researchers to produce analytical data sets that are traceable and thus verifiable for analysis or replication. Lastly, options related to the validity and efficiency of data are discussed, and we highlight the ongoing debate surrounding the ethics of online data collection, ultimately advocating for the fair use of online data.

Keywords: Data Collection; Reproducible Research; Web Scraping; Headless Browser; APIs; Python
Language: English
The Source to Article: PB_Publication
Document Type: Journal article
Collection: Faculty of Education
Corresponding Author: Cai, T.
Recommended Citation
GB/T 7714: Li, F., Zhou, Y., Cai, T. Trails of Data: Three Cases for Collecting Web Information for Social Science Research[J]. Social Science Computer Review, 2019: 1-21.
APA: Li, F., Zhou, Y., & Cai, T. (2019). Trails of Data: Three Cases for Collecting Web Information for Social Science Research. Social Science Computer Review, 1-21.
MLA: Li, F., et al. "Trails of Data: Three Cases for Collecting Web Information for Social Science Research." Social Science Computer Review (2019): 1-21.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.