Is Academic Research Computing an Industry?
A major theme of this monograph is the application of business concepts to the subject area of a professional organization called CASC, the Coalition for Academic Scientific Computation. CASC has 75 organizational members, each of which is some sort of center, department or other academic unit that provides services and support for research computing.
This monograph, "The Business of Energy in U.S. Academic Research Computing," is the major deliverable for my nine-month practicum study as part of my MBA in Sustainable Systems from the Bainbridge Graduate Institute (BGI). As the title implies, the domain of interest is CASC members and similar academic centers, particularly in the US. Much of what is discussed should be of interest outside the US, as well. The project includes emphasis on the social purpose of these centers, employee development, and approaches to greater energy efficiency for the computers and other equipment operated by the centers. Key resources and a stakeholder analysis are provided, in the tradition of other business analyses.
Is academic research computing appropriate for consideration as an industry, distinct from the industry of higher education? From a commercial perspective, information technology is defined as an industry, with high performance computing (HPC, also known as supercomputing) a specific subset of that industry. Similarly, higher education is an industry, made up of different types of colleges and universities (over 3,000 in the U.S.). I believe that academic research computing is appropriately and accurately treated as its own industry, and will proceed in this monograph on that basis. By identifying academic research computing as an industry, we can more easily apply the tools and language of business to it. This will, I hope, benefit the industry as we make the case for investment in, and utilization of, our resources.
Academic Research Computing as an Industry
Industry may be defined as the production of an economic good or service within an economy. Do research computing units provide a good or service? Indeed, they do.
A research computing unit might be called a supercomputing center, or a research technologies center, or it might be part of a larger unit such as a college or an academic computing center. Units have anywhere from one or two dedicated personnel, up to over a hundred. They might operate a small computational cluster worth under $20,000, or might operate one of the world's largest computer systems, worth over $100M.
The goods are the hours of computation (i.e., CPU hours), the large-scale storage, high-speed networks, and other tangibles. They are also output products such as papers and visualizations, which might be co-created by center staff with the users (whom we can, for current purposes, think of as the main customers).
The services include the operation of the computing, storage and networking equipment, which is rather specialized and takes place in purpose-built data centers. In some centers, these aspects are outsourced to cloud providers, to other campuses, or to commercial industry - sometimes a combination. The other main service is providing guidance on how to best utilize the goods: the transformation of raw data, or scientific questions, or results of simulations, into more refined data, into answers, and into knowledge.
In smaller centers, the service might focus on keeping systems running, and helping with basic usage problems or diagnosing problems. The largest centers provide in-depth training and support, and also develop new software or techniques to add value to the output of their goods. Most centers perform a mix of these activities. The specific mix is dependent on how large the center staff is, the extent and sophistication of the user base, and their scientific areas of interest.
Research computing units provide goods and services. In my past position at ARSC, I performed lifecycle analysis of the equipment for computing and storage, along with the necessary personnel to run them and support their users. This yielded a cost per CPU hour for computation, and a cost for storage (expressed per terabyte per year). Other centers have done similar valuations, and so have cloud providers. As in other industries (for example, automobiles), there is a wide range, but for comparable services pricing falls approximately within an order of magnitude (i.e., the most expensive is probably no more than 10X the least expensive, and probably closer in price than that).
Research computing provides goods and services, and we can assert market values for those goods and services. Many aspects of the value are unambiguous: the cost of electricity to power the equipment, the salaries of the people to service and support it, and the cost of the equipment itself (as well as added costs for software, technical support, and maintenance). Other aspects are more subjective, such as the needed turnaround time for a job to be completed, the incremental costs of providing specialized support, and technology refresh schedules.
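The lifecycle costing described above can be sketched as simple arithmetic. The following is a minimal illustration, not the analysis actually performed at ARSC: the class name, function name, and all dollar figures are hypothetical, and it assumes that cost per CPU hour is total lifecycle cost divided by the CPU hours actually delivered over the service life.

```python
from dataclasses import dataclass


@dataclass
class LifecycleCosts:
    """Hypothetical lifecycle costs for one system, in US dollars."""
    equipment: float         # purchase price of the hardware
    software_support: float  # software, technical support, and maintenance
    electricity: float       # power for the equipment over its service life
    personnel: float         # salaries for operations and user support staff

    def total(self) -> float:
        return (self.equipment + self.software_support
                + self.electricity + self.personnel)


def cost_per_cpu_hour(costs: LifecycleCosts, cores: int,
                      years: float, utilization: float) -> float:
    """Divide total lifecycle cost by the CPU hours actually delivered."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the interval (0, 1]")
    hours_per_year = 24 * 365
    delivered_hours = cores * hours_per_year * years * utilization
    return costs.total() / delivered_hours


# Illustrative inputs only; these are assumptions, not data from any center.
example = LifecycleCosts(equipment=1_000_000, software_support=200_000,
                         electricity=300_000, personnel=500_000)
rate = cost_per_cpu_hour(example, cores=1000, years=4, utilization=0.8)
print(f"${rate:.4f} per CPU hour")
```

The more subjective factors, such as turnaround time or refresh schedules, enter a model like this only indirectly, for example through the assumed utilization and service life.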
It is my belief that research computing is unusual within higher education, in that its somewhat specialized goods and services are found mainly in higher education in the US, with few direct analogues in the private sector. This topic will be unpacked in more detail in the industry analysis. For now, consider that many colleges and universities provide services such as housing, dining, printing, and computer repair. Yet these are also offered by private industry, and sometimes a college might partner with private industry to offer the services.
In research computing, there are relatively few private offerings. This is changing as additional cloud-type services become available, and seek to mainstream and commodify supercomputing and large-scale storage. This evolution is a major threat to research computing centers that will be addressed herein. Today, however, it is an industry that exists mainly in campus-based units.
What is the economy in which the units exist? As with most economies, there are several, and they are nested. Customers are often local, at the same institution as the unit; mainly these are professors and students. Professors might have research grants that contribute to the economic well-being of a center. Variation in the economics of centers is considerable, and will be addressed. Suffice it to say, there are local economies at institutions of higher education. These exist within communities, and are often part of the state economic engine (especially for public institutions). Centers make large purchases of computers and other equipment. They use a great deal of electricity. They are also players nationally, competing for grants and contracts from a variety of sources. Yes, they are certainly operating within economies, and have an impact on those economies.
Thus, based on the definition above, we see that research computing may be thought of as an industry in its own right, despite operating as part of at least two other industries: HPC and higher education.
The High Performance Computing Industry
HPC is recognized as a global industry, and generally viewed as a subset of the information technology (IT) industry. IT includes telecom (which, in turn, includes cellular communication and devices), all types of computers, and services. The 2013 IDC industry report (http://www.idc.com/downloads/idc_at_sc13_%2011-19-2013_final.pdf) tracks these industry components:
- Client devices
- Telecom equipment
Equipment for HPC (that is, supercomputers, large-scale storage, and high-end networking equipment found in supercomputing centers) is mostly grouped by IDC under servers. Revenue for HPC in 2012 exceeded $11B (US), and remains on a modest growth curve of 2-4%, with some regions exhibiting stronger growth than others. This $11B figure is for servers only. A further $10B is recognized for storage ($3.7B), software ($3.4B), services ($1.8B) and middleware ($1.1B).
The report highlighted disruptive trends in HPC, including the slow growth of cloud computing, incorporation of accelerator technology, challenges of power and cooling, software and software usability, new emphasis on big data, and the many unknowns as the industry marches on from petascale to exascale-level systems. These disruptions are important for academic research computing as well, and are mentioned elsewhere in this monograph.
Thirteen separate industries make up the $11B HPC revenues from 2012 (slide 20). "University/academic" was $2.05B in 2012, with a 5.9% projected growth through 2017. Worldwide, this represents 18.5% of the HPC industry as identified by IDC. Below, we will see what that means in terms of research revenues to the universities that operate HPC facilities.
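As a quick consistency check on the figures quoted above, the arithmetic can be reproduced in a few lines. This is only a sketch: the constant and function names are hypothetical, and the inputs are the rounded values given in the text, not the underlying IDC data.

```python
# Figures quoted in the text, in billions of US dollars (rounded).
NON_SERVER_SEGMENTS = {
    "storage": 3.7,
    "software": 3.4,
    "services": 1.8,
    "middleware": 1.1,
}
ACADEMIC_REVENUE = 2.05  # "University/academic" server revenue, 2012
ACADEMIC_SHARE = 0.185   # share of HPC server revenue quoted in the text


def non_server_total(segments: dict) -> float:
    """Sum the non-server segments, rounded to the precision of the inputs."""
    return round(sum(segments.values()), 1)


def implied_total(segment_revenue: float, share: float) -> float:
    """Total revenue implied by one segment's revenue and its share."""
    return segment_revenue / share


print(non_server_total(NON_SERVER_SEGMENTS))
print(round(implied_total(ACADEMIC_REVENUE, ACADEMIC_SHARE), 2))
```

The four non-server segments sum to the $10B quoted, and the academic share implies a server total slightly above $11B, which agrees with the statement that 2012 revenue "exceeded $11B."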
Return on Investment
IDC introduced a new model for Return on Investment in high performance computing during the fall 2013 IEEE/ACM Supercomputing meeting in Denver. The overall approach is still under development, but has already demonstrated some interesting findings. Emphasis is on research-generated outcomes, across industries. The model and report are available at http://www.hpcuserforum.com/ROI/. Highlights on ROI announced to date (slide 58):
- $356 on average in revenue per dollar of HPC invested.
- $38 on average of profits (or cost savings) per dollar of HPC invested.
For the University/Academic segment, the survey identified $37.4M in revenue, and $70.8M in profit (or cost savings), across only 12 centers, which was the sample size for the educational segment. The full data set, as a spreadsheet, is available at the same Web address. In this monograph, we will describe a similar approach to ROI for academic research computing centers, but with emphasis on academic outcomes, not just revenues.
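The "per dollar of HPC invested" ratios reported by IDC are, in form, an outcome divided by an investment. The sketch below illustrates that ratio only; it is not IDC's model, the function name is hypothetical, and the dollar amounts are invented for illustration rather than taken from the survey.

```python
def return_per_dollar(outcome_usd: float, hpc_investment_usd: float) -> float:
    """Outcome (revenue, profit, or cost savings) per dollar of HPC invested."""
    if hpc_investment_usd <= 0:
        raise ValueError("HPC investment must be positive")
    return outcome_usd / hpc_investment_usd


# Hypothetical center: all three figures are assumptions for illustration.
investment = 2_000_000      # equipment, staff, and operations
grant_revenue = 9_000_000   # research funding attributed to HPC use
cost_savings = 1_500_000    # e.g., avoided commercial computing charges

print(f"revenue per dollar: {return_per_dollar(grant_revenue, investment):.2f}")
print(f"savings per dollar: {return_per_dollar(cost_savings, investment):.2f}")
```

An academically oriented version could substitute non-financial outcomes, such as publications or degrees supported, for the numerator, at the cost of no longer yielding a dimensionless ratio.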
CASC Perspective on the Industry
The section on Organizational Culture goes into detail on how CASC is, among other things, an important culture keeper for the industry. CASC serves to bring together key stakeholders from within the industry, for conversations with each other and with stakeholders from the government, from scientific communities, and from the computing industry. Most recently, CASC has engaged in a self-study of its membership, discussed further in the Organizational Culture section.
More than any other organization, CASC strives to be inclusive of all types and sizes of research computing centers in the US.
"Small HPC" BoF Perspective on Industry
Since 2009, a Birds of a Feather (BoF) session has been hosted at the annual ACM/IEEE Supercomputing conference. This BoF has given members of the industry, who are mostly from CASC institutions, the opportunity to share their experiences. It's no secret that there are relatively few very large academic research computing centers. The XSEDE centers, for example, have in many cases seen decades of growth. The distribution of research computing center sizes has a greater number of medium and small centers than large, whether measured in number of staff members, size of computational resources, annual budget, or outcomes such as scientific productivity and innovation. At the smallest end are departments or individual researchers that are seeking to grow a broader mission. The vast middle consists of centers that have anywhere from a fractional full-time equivalent person up to, perhaps, a dozen or so who are devoted to supporting the users and systems that are at the heart of the center.
The BoF has had an annual survey that gives substance and definition to the vast middle. Surveys from 2011-2013 are available at the Small HPC BoF site: https://sites.google.com/site/smallhpc/files-and-documents. Outcomes include counts of personnel, as well as systems, the user base, and software support.
CASC, as an organization, is relatively mature, and small enough for multiple discussions, multiple missions, and different values. The member organizations, though, have a wide range of size, age, maturity and emphasis. The personnel within them, as well as the organizations themselves, can be at different phases in the organizational lifecycle.
- Entrepreneurial/startup: As in any small business or other startup, the founders are likely to be deeply engaged in setting the course for the organization, and in communicating, and sometimes adapting, its purpose. Tolerance for rapid change is necessary.
- Infant: Learning to walk. Beyond the startup phase, such an organization is seeking to find its way. Activities include growing staff, procedures, and becoming increasingly comfortable with mission, values, vision, and goals.
- Stable: Established mechanisms for getting work done, with relatively few inclinations for change. The founder might still be with the organization, but is likely to be less concerned with change than with stability.
- Mature: Associated with bureaucracies, these organizations might have more rigid procedures. They are often larger. In research computing, government labs have some of these characteristics. Smaller organizations might have such traits, especially if their host organization tends towards bureaucracy or autocracy.
None of these lifecycle phases (after Sanford, 2011) dictates greater or lesser levels of functionality, or necessarily yields greater success. It is important for managers within them to recognize the organization's relative tolerance for ambiguity, entrepreneurship, caution, procedure, etc. Alignment within the research computing organization, of the organization with its parent institution, and with the users and other stakeholders, will help keep things running smoothly. Alignment of the personal values of staff (such as tolerance for ambiguity, for rule enforcement, or for deliberate change management) is likely to be critical for staff happiness and advancement.