Kerk Kee is developing a framework to measure, build and improve virtual organizations’ capacity.
When the concept of big data (large data sets analyzed computationally) emerged in the early 2000s, it was predominantly used in the science, technology, engineering and mathematics (STEM) fields. Now, big data sets are used in almost every discipline, from STEM to business, social sciences, humanities and more.
Cyberinfrastructure (CI), a platform of multiple technologies that can enable and support the computational processing of big data, is needed to gather and analyze the data and fully digest all the information it contains, especially as the users of big data shift away from solely STEM areas. Today, more multidisciplinary groups are forming to carry out CI projects, creating new problems that have less to do with the data being analyzed and more to do with how the groups themselves are organized.
To help solve these problems, Texas Tech University's Kerk Kee received a $519,753 CAREER grant from the National Science Foundation (NSF) to study the diffusion of CI and how to improve organizational capacity in CI projects, particularly in virtual organizations involving multi-disciplinary and multi-institutional collaborators. This capacity covers what it takes for collaborators to carry out daily work toward long-term goals, along with all the resources necessary to make that happen when personnel, datasets and technologies are spread across multiple institutions. The grant was initially awarded to Kee when he was an assistant professor at Chapman University, but it was transferred to Texas Tech when he joined the College of Media & Communication as an associate professor.
"When we use the term 'cyberinfrastructure projects,' that usually refers to an academic or research project that has a component of using big data to investigate some questions at the forefront of the discipline," Kee said. "The idea is that, in the past, before big data, most research projects focused on smaller data sets, often locally. Now, with cyberinfrastructure, we can aggregate all these data sets across locations together or collect larger-scale data sets so we can process it to investigate grand challenges in research."
The increasing popularity of big data has led more disciplines outside of STEM to incorporate it into research projects because of the numerous insights that can be uncovered. However, Kee said that without the appropriate technologies to process the data and experts to interpret the analysis, those insights remain locked in the raw data.
If this data is appropriately integrated into more projects, researchers can accomplish more scientific breakthroughs and bigger discoveries. That's where CI comes in. But learning how to use CI and big data involves incorporating a range of new behaviors, something few people can do because they don't have the skill set for it.
"If an artist wants to pull data from the web and do some pretty visualization, that's interesting work," Kee said. "But not every artist is trained to program. So it requires a mixture of the adoption of the object, the adoption of the behavior and the adoption of the ideology."
Because big data is a relatively new phenomenon, Kee said there aren't enough technologies out there to analyze it effectively and efficiently. Most CI projects focus on collecting and processing the data, but they also need to focus on developing technology in the process. This poses a problem for others wanting to use big data, because few people know how to perform every aspect of it themselves.
That's where organizational capacity and capacity building for CI projects are needed.
Kee said the CI movement is STEM-oriented, and while most project leaders recognize that these projects are multi-disciplinary and multi-institutional, they were not trained to deal with the social and organizational aspects of these projects.
"People talked about how they recognize that projects sometimes fail, not because they don't have the technical expertise," Kee said. "It's that they are unable to organize successfully, then they create a lot of delays and problems with these projects."
After collecting and analyzing his interviews and surveys, Kee said he is hoping to develop a model that identifies all the key aspects for a successful CI project from an organizational and social standpoint. This includes the group's ability to recognize problems and adapt quickly, deal with norms for each discipline involved with the project and work with a diverse group of people.
"If everyone is from the same discipline and the same institution, it's much easier to coordinate," Kee said. "When you have people from very different disciplinary backgrounds and geographic locations, they may have very different perspectives, motivations, goals, schedules and commitments, which can be a struggle."
To overcome this, Kee wants to identify a list of factors that lead to successful CI projects and organizational capacity and use that list to create a questionnaire groups can use to pinpoint potential strengths and weaknesses.
Before a research group, or virtual organization of multi-institutional collaborators, starts its project, it would go to a website and fill out the questionnaire. Once the questionnaire is submitted, the group would receive a score. This number would be broken into sub-scores for different dimensions to give the group a better idea of where it is lacking and where it is already strong. Kee also wants to give groups capacity-building strategies so they can improve their lower scores.
Kee equates the score to a credit score. While a credit score is regarded in the financial industry as the ultimate measure of your creditworthiness, you can still change how you manage your money on a daily basis to improve your likelihood of paying back a loan on time. In the long run, the money management decisions you make today will improve your future credit score. That's the purpose behind this algorithm and score.
"The idea is that we give you an assessment of where you are right now, but also here's something you can do to increase your likelihood of success from an organizational standpoint," Kee said. "That way, once you put your research group together and you launch a CI project, you're more prepared to succeed."
After the online framework is ready to roll out, Kee also is considering developing a training program for groups to participate in. This would be a program customized to the CI context so these predominantly STEM professionals can learn capacity-building strategies.
While the focus of the project is on CI in the STEM fields, Kee said the results likely could be applied to a wide range of disciplines.
"I want to be able to create a website with an instrument, an algorithm and a complex framework that would help either seasoned cyberinfrastructure projects, or first-time cyberinfrastructure projects, to be more strategic in the way they set up and execute their project," Kee said.