Recommendation 2 - Data Governance
Implement Organization-Wide Data Governance Structures
The second most common set of needs described by staff (after those addressed in Recommendation #1) relates to data accessibility, security, validation, ownership, and documentation, the concepts that together comprise “Data Governance.” Data Governance refers to the overall management of the availability, usability, integrity, and security of data used in an enterprise. A sound data governance program includes a governing body or council, a defined set of procedures, and a plan to execute those procedures (Institute, n.d.).
At the highest level, this recommendation area proposes the establishment of an organizational Data Governance Committee (DGC), composed of representatives from across the AIU who meet regularly to address data-related matters escalated by staff data owners and data stewards. The DGC would ratify a charter for Data Governance operations, provide data governance training, and form workgroups as necessary to address data-related matters, document data-related procedures, and agree upon chains of data stewardship and access policies.
The sections below represent specific recommendations relevant to this focus area that arose from this Data Ecosystem Audit; these items naturally fall within the scope of a DGC’s work, and should be addressed by appropriately selected workgroups formed through the Data Governance process.
Review AIU data resources and processes impacted by PA Act 151 and ensure data protection and encryption recommendations are followed
Through PA Act 151 of 2022, the Pennsylvania General Assembly updated security provisions related to the storage and transmission of student data, including requirements for data encryption and for notification policies in the event of a breach of security (Assembly 2022). Staff in Technology Services, the Legal team, and program teams managing student data on behalf of the state of Pennsylvania recommend a thorough review of AIU data resources (application systems, databases, servers, and cloud resources) that may be affected by the provisions of Act 151 to ensure adequate security protections and emergency response procedures are in place.
Implement organization-wide information protection and classification practices
Interviews with staff members across many AIU departments revealed not only a broad collection of data management tools but also a broad spectrum of procedures for information management. This recommendation, to implement information protection and classification practices, follows from CIS Control Area 03: Data Protection (Internet Security, n.d.c) and suggests the implementation of a data classification schema in commonly used data and communication platforms, such as email, word processors, spreadsheets, and SharePoint libraries. (The M365 programs AIU staff primarily use for these functions support information classification and labeling.)
An information classification schema designates the sensitivity of a given document, file, communication, or other resource. Sensitivity labels should correspond with organizational procedures concerning data access, retention period, storage, and destruction.
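As a minimal sketch of how sensitivity labels might map to handling procedures, the Python below pairs each label with access, retention, storage, and destruction rules. The label names, audiences, and retention periods are hypothetical placeholders chosen for illustration; they are not AIU policy and would need to be defined by the DGC.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity labels, ordered from least to most restrictive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


@dataclass(frozen=True)
class HandlingRule:
    """Procedures tied to a label: access, retention, storage, destruction."""
    allowed_audience: str
    retention_years: int
    encrypt_at_rest: bool
    secure_destruction: bool


# Placeholder values for illustration only (assumptions, not actual policy).
HANDLING_RULES = {
    Sensitivity.PUBLIC: HandlingRule("anyone", 1, False, False),
    Sensitivity.INTERNAL: HandlingRule("all staff", 3, False, False),
    Sensitivity.CONFIDENTIAL: HandlingRule("owning team", 7, True, True),
    Sensitivity.RESTRICTED: HandlingRule("named individuals", 7, True, True),
}


def handling_for(label: Sensitivity) -> HandlingRule:
    """Look up the handling procedures that apply to a given label."""
    return HANDLING_RULES[label]


def strictest(labels: list[Sensitivity]) -> Sensitivity:
    """A resource combining several sources inherits the most restrictive label."""
    return max(labels)
```

For example, a report that merges an internal roster with confidential student records would take the confidential label under `strictest`, and `handling_for` would then return the rules that govern it.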
Adopt Standard Operating Procedures (SOPs) for the development of applications to be used within the AIU digital ecosystem
Some teams’ unique data needs and processes have led them to develop homegrown systems or contract with developers to create custom applications; several additional teams currently use a collection of spreadsheets and file shares, instead of an application system, but have distinct enough processes that they may consider tailored application development as a solution in the future.
Custom application development brings the benefit of bespoke functionality, matching the exact needs of specialized teams; however, homegrown and custom applications do not face the same regulatory and compliance standards as enterprise applications in the marketplace. This amplifies the AIU’s burden of responsibility for application security, functionality, and accessibility vetting, as these qualities cannot be assumed without testing.
To ensure all application systems and data resources are secure, reliable, accessible, and compliant with data privacy regulations, the AIU should adopt SOPs for the development (whether internal or contracted) of applications to be used within the AIU’s digital ecosystem. These should include provisions for data storage and transmission security, role-based access controls (RBAC), redundancy, authentication frameworks, third-party integrations, hosting, and related application features and functionality.
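One of the provisions above, role-based access control, can be illustrated with a short Python sketch. The role names and permitted actions are hypothetical; an actual SOP would specify roles appropriate to each application and would typically rely on the organization's authentication framework rather than a hard-coded table.

```python
# Hypothetical roles and the actions each one grants (illustrative only).
ROLE_PERMISSIONS = {
    "viewer": frozenset({"read"}),
    "editor": frozenset({"read", "update"}),
    "steward": frozenset({"read", "update", "delete"}),
}


def is_allowed(role: str, action: str) -> bool:
    """Return True only if the role explicitly grants the action.

    Unknown roles and unknown actions are denied by default.
    """
    return action in ROLE_PERMISSIONS.get(role, frozenset())
```

The deny-by-default lookup is the key design choice here: a role or action missing from the table results in refusal rather than accidental access.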
As of early 2024, the Technology Services department has recognized this recommendation as a focus area by including it in a new collection of IT Controls, specifically under Control 5.9.0: Application Development (Internet Security, n.d.a).
Create and maintain an inventory of AIU data assets, identifying key application systems, their data relationships, and system ownership
This recommendation falls within the scope of this Data Ecosystem Audit project and its associated strategic goal, and work on it has been underway throughout the course of the project. In addition to following a general best practice of thorough documentation to improve awareness, institutional knowledge, and accessibility of information, cataloging AIU data assets serves to improve the organization’s security posture.
Further, an inventory of data assets, integrations, and ownership enables organizational leadership to identify opportunities to improve efficiency in system procurement (by consolidating redundant systems or investing in system integrations) and to address opportunities for improvement where gaps exist in system maturity, security, access, or data flow.
Throughout this Data Ecosystem Audit, the project team has used a relational database to capture the systems described in staff team interviews. The database is structured to record metadata for each system, including the team that “owns” the system, its data integrations with other systems and the nature of those integrations (automated or manual upload), SSO (single sign-on) status, visibility on the company launchpad portal, and system category (e.g., Learning Management System (LMS), ad hoc spreadsheet serving as the system of record, communications platform, etc.).
The database can also associate systems with business processes, recommendations (such as those arising from this project), and granular data access details (such as the specific data points stored in a system and the parties who can view, modify, or delete each data element), though a full enumeration of these details for the more than 190 applications currently identified is beyond the scope of this project and would rely on extensive input and updates from the staff teams responsible for each system.
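The inventory structure described above could be sketched roughly as follows, using Python's built-in `sqlite3` module. The table names, column names, and helper functions are assumptions made for illustration and do not reflect the project team's actual database design.

```python
import sqlite3

# Hypothetical schema: one table for systems, one for integrations between them.
SCHEMA = """
CREATE TABLE app_system (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    owner_team TEXT NOT NULL,
    category TEXT,
    sso_enabled INTEGER NOT NULL DEFAULT 0,
    on_launchpad INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE integration (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES app_system(id),
    target_id INTEGER NOT NULL REFERENCES app_system(id),
    method TEXT NOT NULL CHECK (method IN ('automated', 'manual upload'))
);
"""


def build_inventory() -> sqlite3.Connection:
    """Create an empty in-memory inventory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    return conn


def add_system(conn, name, owner_team, category,
               sso_enabled=False, on_launchpad=False) -> int:
    """Insert one system and return its row id."""
    cur = conn.execute(
        "INSERT INTO app_system (name, owner_team, category, sso_enabled, on_launchpad) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, owner_team, category, int(sso_enabled), int(on_launchpad)),
    )
    return cur.lastrowid


def add_integration(conn, source_id, target_id, method) -> None:
    """Record that data flows from one system to another by the given method."""
    conn.execute(
        "INSERT INTO integration (source_id, target_id, method) VALUES (?, ?, ?)",
        (source_id, target_id, method),
    )


def manual_integrations(conn) -> list[tuple[str, str]]:
    """List (source, target) name pairs that still depend on manual uploads."""
    return conn.execute(
        """
        SELECT s.name, t.name
        FROM integration AS i
        JOIN app_system AS s ON s.id = i.source_id
        JOIN app_system AS t ON t.id = i.target_id
        WHERE i.method = 'manual upload'
        ORDER BY s.name, t.name
        """
    ).fetchall()
```

A query such as `manual_integrations` shows the kind of question this structure makes easy to answer: which data flows still rely on manual uploads and might therefore be candidates for automation.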
Please refer to Appendix A for the current listing of application systems identified through this project.
Provide data process management training and resources
Dedicated training sessions to teach staff (A) what data management tools exist in the AIU data ecosystem and (B) how to match use cases with the right tools would improve teams’ efficiency, autonomy, and effectiveness in addressing data challenges.
Team interviews conducted for this project revealed a wide variety of methods employed to resolve similar data problems. Consider the example of collecting data from clients served. To gather similar information, such as sign-ups, demographic info, program eligibility, etc., teams currently use:
- Paper Forms - completed by end user, then hand-entered into a database system (and/or physically filed) by AIU staff
- Distributed Spreadsheets - shared and completed by a partner entity (e.g. a school district that enters all currently eligible students on a spreadsheet shared with the AIU), then typically collated into a master spreadsheet by AIU staff
- Master Spreadsheets - single spreadsheet shared with all relevant parties, who enter their own information on a given line
- Microsoft Forms (static/“owned” by individual staff member) - digital form in the M365 ecosystem from which an AIU staff member can export a snapshot of responses, in spreadsheet format, at any time
- Microsoft Forms (dynamic/“owned” by SharePoint group) - digital form in the M365 ecosystem that outputs live results to a shared Excel Online spreadsheet in real time
- SurveyMonkey - enterprise survey/form design and data collection platform (licensing acquired separately by individual departments)
- Airtable - enterprise platform for structuring, collecting, and managing data
- Google Forms - digital form in the Google ecosystem with live results synced to a Google Sheet
This list is not exhaustive, and it shows that many different tools are employed to solve the same problem. During interviews, many staff members expressed a desire for guidance on what resources exist for data collection and management, on how to choose the best tool for the job, and on whom to ask for support. Additionally, several teams identified spreadsheet-based workflows as among the most time-consuming parts of their data operations.
Targeted staff training, divided into the three session offerings below, could address this need and others:
Excel-Based Data Management Workflows for Everyday Use
While each team has unique needs and reporting requirements, nearly all staff use spreadsheets for many aspects of their data operations. This session would highlight common techniques to improve staff members’ efficiency and capabilities in using spreadsheets for data-related tasks, such as data cleaning; “cross-checking” data from multiple files, tables, or tabs; using lookup formulas and built-in data validation functions; and leveraging data table functionality for sorting, filtering, and formatting data. The target audience for this session would be staff with day-to-day responsibility for managing operational data.
Data Solutions beyond Spreadsheets
A spreadsheet is a versatile tool, but many teams’ data challenges call for more powerful or complex solutions. This session would aim to help participants (A) grasp the scope of data tools, platforms, and capabilities that exist, and (B) match their data use cases and scenarios with the right tools for the job. Participants should leave the session with a better sense of when a spreadsheet is or is not the best solution and of how to find the approach that is most efficient, secure, and effective for the use case at hand.
Options for Advanced Data Management and Analytics
Because many teams spend significant staff time on data collection, entry, validation, and reporting, they have limited remaining capacity for conducting analysis and deriving insights from program data to inform decision making and strategic priorities. This session would introduce team and program leads to a basic menu of analytics pathways that could be leveraged using existing AIU resources and Data Services staff. The session would focus on demonstrating possibilities using data dashboards and statistical analysis/machine learning approaches to answer a set of example data questions. Participants should leave with (A) an idea of the data questions and insights they could explore with a given analysis tool, and (B) an understanding of the toolkit available, the relative time and effort required to deliver/receive an analytics “product”, and the staff and technology resources they can leverage to start exploring their data.