9:00am - 9:15am
ID: 147 / PS-18: 1
Topics: Data Science; Analytics; and Visualization
Keywords: Data Science; Competency Framework; Human-Centered Approaches; Employment Analysis
An Analysis on Competency of Human-centered Data Science Employment
Wuhan University, People's Republic of China
The rapid rise of data science has created talent gaps and raised concerns about how employment competencies are developed. This paper analyzes the data science employment market in the information science context, using open data from an online recruitment website. In addition to basic qualitative analysis and descriptive statistics of job advertisement characteristics, it establishes a competency framework for the data science workforce through content analysis. The objective is to use market needs to guide institutions that are planning or revising a major in data science, to help bridge the gap between the high demand for and low supply of data scientists, and to enable existing data science teams to operate more efficiently. Our working results indicate that the market is looking for ways in which humans can integrate their roles into data science, and that these depend on spreading knowledge of how to use human-centered data science and of its benefits. Our proposed model explains the operating mechanism.
9:15am - 9:30am
ID: 160 / PS-18: 2
Topics: Privacy and Ethics
Keywords: Ethics of Artificial Intelligence (AI); Social factors in Data Science; Knowledge Infrastructures; Value Sensitive Design (VSD); Social Informatics
Good Systems, Bad Data? Interpretations of AI Hype and Failures
1School of Information, University of Texas at Austin, USA; 2LBJ School of Public Affairs, University of Texas at Austin, USA; 3Cisco Systems, USA
Artificial intelligence (AI), including machine learning (ML), is widely viewed as having substantial transformative potential across society, and novel implementations of these technologies promise new modes of living, working, and community engagement. Data and the algorithms that operate upon them thus carry an expansive ethical valence, bearing consequence for both the development of these potentially transformative technologies and our understanding of how best to manage and support their impact. This research reports on an interview-driven study of professional stakeholders engaged with technology development, policy, and law relating to AI. Among the experts we studied, unexpected outcomes and flawed implementations of AI, especially those leading to negative social consequences, are often attributed to ill-structured, incomplete, or biased data, while the algorithms and interpretations that might produce those consequences are seen as neutrally representing the data, or as otherwise blameless. We propose a more complex infrastructural view of the tools, data, and operation of AI systems as necessary to the production of social good, and explore how representations of the successes and failures of these systems, even among experts, tend to valorize algorithmic analysis and locate fault in the quality of the data rather than in the implementation of systems.
9:30am - 9:40am
ID: 252 / PS-18: 3
Topics: Data Science; Analytics; and Visualization
Keywords: Relation extraction, Information extraction, Scholarly text mining, Knowledge graphs
Targeting Precision: A Hybrid Scientific Relation Extraction Pipeline for Improved Scholarly Knowledge Organization
1University of Illinois at Urbana-Champaign, USA; 2TIB Leibniz Information Centre for Science and Technology, L3S Research Center at Leibniz University of Hannover, Germany
Knowledge graphs have been successfully built from unstructured texts in general domains such as newswire by leveraging distant supervision relation signals from linked data repositories such as DBpedia. In contrast, the lack of a comprehensive ontology of scholarly relations makes it difficult to adopt distant supervision in the same way to create knowledge graphs over scholarly articles. In light of this difficulty, we propose a hybrid approach to extract scientific concept relations from scholarly publications by: 1) utilizing syntactic rules as a form of distant supervision to link related scientific term pairs; and 2) training a classifier to further identify the relation type per pair. Our system targets high precision rather than high recall, aiming to reduce noisy results, albeit at the cost of extracting fewer relations, when building scholarly knowledge graphs over massive-scale publications. Results on two benchmark datasets show that our hybrid system surpasses the state of the art with an overall 60% F1 score, driven by a nearly 15% precision boost in identifying related scientific concepts. On relation classification, we further achieved an overall F1 between 34.1% and 51.2%, depending on the experimental dataset.
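The precision, recall, and F1 measures named in this abstract can be illustrated with a minimal Python sketch. The function name and all counts below are hypothetical, invented only to show how a precision-oriented extractor can return fewer relations yet score a higher F1; they are not taken from the paper or its datasets.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) for one set of extraction counts."""
    predicted = true_positives + false_positives   # relations the system emitted
    actual = true_positives + false_negatives      # relations in the gold standard
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (not from the paper); 100 gold relations in both cases.
# A recall-oriented extractor emits many pairs, most of them noise.
recall_oriented = precision_recall_f1(
    true_positives=80, false_positives=120, false_negatives=20
)
# A precision-oriented extractor emits fewer pairs, most of them correct.
precision_oriented = precision_recall_f1(
    true_positives=50, false_positives=10, false_negatives=50
)

for name, scores in [("recall-oriented", recall_oriented),
                     ("precision-oriented", precision_oriented)]:
    p, r, f1 = scores
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a large gain in precision can outweigh a moderate loss in recall, which is the trade-off the abstract describes.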
9:40am - 9:50am
ID: 328 / PS-18: 4
Topics: Privacy and Ethics
Keywords: data science ethics, serious games, diversity
Advancing Diversity in Human Centered Data Science Education Through Games
1University of Washington, USA; 2University of North Texas, USA; 3Nelson Institute, University of Wisconsin-Madison, USA
Educational games, particularly those that encourage collaboration with peers and focus on social and ethical issues, may be powerful in improving retention of human-computer interaction (HCI) and human-centered data science (HCDS) concepts among young people by providing strong emotional experiences. Further, games have the potential to reach a wider and more diverse population than formal education. We draw upon prior experience with improving diversity in science, technology, engineering, and mathematics (STEM), as well as experience building and deploying HCDS games, to suggest novel uses of gaming to increase the retention of concepts in HCI and data science among diverse learners.
9:50am - 10:00am
ID: 361 / PS-18: 5
Topics: Data Science; Analytics; and Visualization
Keywords: Analysis Framework; Open Academic Graph; Research Frontier Analysis; Subject Topic Evolution Analysis
An Analysis Framework of Research Frontiers Based on the Large-scale Open Academic Graph
1University of Texas at Austin, USA; 2Wuhan University, People's Republic of China
[Purpose] As a high-quality and well-structured dataset, the large-scale open academic graph that emerged from the open science movement has created new conditions for research frontier analysis. Constructing an analysis framework of research frontiers based on the large-scale open academic graph can effectively promote data-driven knowledge discovery and support analysis and decision-making in sci-tech intelligence. [Approach] The definitions and analysis methods of research frontiers were summarized from related studies, and the data structure of a specific open academic graph was investigated. [Findings] The rationale and steps of research frontier analysis based on the open academic graph were set out, and a usable analysis framework of research frontiers based on the large-scale open academic graph was constructed. [Value] The proposed framework can support deep, relevant, and dynamic analysis of research frontiers in various disciplines based on the emerging large-scale open academic graph. It provides a novel perspective for dynamic analysis across time and space, multidimensional analysis under multiple factors, and multiscale evolution analysis of research frontiers.