Data Management Framework

Research Analytics Services (RAS) offers the following framework for data management within the Office of the Vice President of Research and Innovation (OVPRI). This framework encompasses various types of data, including financial, sponsored projects, service center, and payroll data.

The framework covers the following topics:

  1. Data governance
  2. Storage
  3. Type
  4. File formats
  5. Organizing files and naming conventions
  6. Documentation
  7. Security and storage
  8. Sharing and archiving
  9. Citing data
  10. Confidentiality and ethical concerns

We encourage employees to email us with questions.


Data Governance

This section outlines the ownership and stewardship of data within the OVPRI.

Sponsored Projects Data 

This includes proposals, awards, and expenditures and is overseen by Sponsored Projects Services (SPS). Proposal and award letter data live in our institutionally developed proposal system, the Electronic Proposal Clearance System (EPCS). The data is entered and updated by SPS, but the system is maintained and secured by Research Technology Services (RTS). Award budgets and expenditures are entered and stored in the institution’s financial system, Banner, which is jointly maintained by SPS and Information Services (IS). IS also manages the database that feeds Integrated Data and Reporting (IDR), the institution-wide data warehouse used for developing reports and visualizations.

OVPRI Service Centers Data

The service centers are designed to support faculty research in multiple areas such as Aquatic Animal Care Services and UO Greenhouses. See Research Core Facilities for more information. The data on service centers include internal and external sales, expenditures, and other related information as requested by the staff. The service centers’ data is managed by Research Core Business Services (RCBS).

Research Integrity Data 

This includes human subjects research data, Conflicts of Interest (COI), Export Controls and Research Compliance Services (RCS) data on grants and other areas, which are managed by Research Integrity.

Non-grant Financial Data 

This includes audited and unaudited financial data and reports, which are managed by Business Affairs (BA). Annually, RAS supports BA with audit requirements such as the Schedule of Expenditures of Federal Awards (SEFA) and other audit-related requests.

Payroll Data 

This includes faculty, staff, and student data as it relates to financial and sponsored projects data. It includes sensitive data that is controlled and managed by UO Human Resources (HR).

Data Management 

The collection, analysis, reporting, and tracking of data within the OVPRI for internal and external stakeholders is managed by RAS, while the security and storage of the OVPRI data mentioned in this document is controlled by RTS.

For additional details, refer to the data structure page.


Storage

Most of the data the OVPRI administration works with is stored in the university’s financial system, Banner, and in the homegrown Electronic Proposal Clearance System (EPCS), which we plan to migrate away from in 2025 in favor of the Research Administrative Portal (RAP). The data in EPCS is entered and managed by Sponsored Projects Services. The relational database where the information is stored and managed, SQL Server, is maintained by RTS, which also manages the security and other engineering processes behind these systems.

IDR is mainly used to store the structured data from Banner. RAS uses this tool to collect data, develop institutional and department-based reports, and facilitate automated reporting for data access and improved efficiency. IDR also imports and stores data from EPCS.

The many areas of research and research-related data that are also under OVPRI, mainly in its centers and institutes, are not part of the scope of this document.


Type

The university collects and works with a vast variety of large datasets including but not limited to student data, employee or human resource-related (HR) data, financial data, research data, and grants data. Most of the data used in the OVPRI is research or grants-related, financial, and HR data. In a few instances, working with student data will be required, especially for graduate students/employees. The data will be categorized by source, format, stability, and volume.

Source

Data can be observational (e.g., survey results and sensor readings), experimental (e.g., gene sequences), simulation (e.g., economic models), or derived or compiled (e.g., from existing databases).

OVPRI works mainly with derived/compiled data from existing databases like EPCS and Banner. The data is reproducible but can be time-consuming to develop. In cases like conducting surveys for the Diversity, Equity, and Inclusion Committee (DEIC), the data source is observational.

Format

The main forms of data used are numeric (e.g., counts, currency, or amounts) and text (e.g., field descriptions).

  • Numeric
    • Finance data from Banner/Cognos
    • HR data, which is sensitive data from Banner/Cognos
  • Text
    • Survey responses (DEIC)
    • Sponsor funding-related (sponsor names, project titles, names of people and units, etc.).
    • Account descriptions

Stability

The stability of the data can be fixed, growing, or revisable.

  • Fixed datasets: The data is not changed or deleted after collection. Fixed datasets are rare within the OVPRI and typically occur only in cases like survey responses.
  • Growing datasets: New data is added, but existing data is not changed or deleted. Most of our data falls into this category. We are constantly adding new data through our homegrown proposal system (EPCS) because researchers are constantly applying for grants, grants are awarded or not, expenditures are made on grants, and cost recovery from the grants is calculated and stored. The enterprise (financial and employee) data in Banner also grows.
  • Revisable datasets: New data may be added, and existing data may be changed or deleted. Due to our continuously improving methodology for entering and storing data, old data that was recorded a certain way is sometimes updated to match current reporting requirements. For example, we now record pass-through federal grants as federal grants rather than by the pass-through entity.

Volume

The university stores and generates enormous amounts of data on a constant basis that cannot be stored in flat files. The datasets are recorded and generated in Banner and EPCS and are stored in packages and tables within Cognos and SQL.

System

Ensure that the data you are working with follows the ROCCC principles (created by data analytics staff at Google), which are used to determine the credibility of data. Credible data must meet all or most of the following criteria:

  • Reliable – When the sample size is reflective of the overall population.
  • Original – The data is collected from the primary source.
  • Comprehensive – The data contains all the necessary details needed for analysis and fair conclusions.
  • Current – The data is no more than a year old.
  • Cited – The source of the data is cited.

If the data does not conform to ROCCC, when relevant, the issues must be noted in the report, email, or query to inform users of the limitations.
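The ROCCC review above can be recorded as a simple checklist. The following is a minimal sketch, not an official tool: the class name `RocccCheck`, its field names, and the wording of the notes are hypothetical, and "within a year" is assumed to mean 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class RocccCheck:
    """Hypothetical checklist record; field names are illustrative only."""
    reliable: bool       # sample size reflects the overall population
    original: bool       # collected from the primary source
    comprehensive: bool  # contains the details needed for analysis
    last_updated: date   # used for the "current" criterion
    cited: bool          # source of the data is cited

    def limitations(self, today):
        """Return a note for every criterion that is not met."""
        notes = []
        if not self.reliable:
            notes.append("Reliable: sample may not reflect the overall population")
        if not self.original:
            notes.append("Original: data was not collected from the primary source")
        if not self.comprehensive:
            notes.append("Comprehensive: details needed for analysis are missing")
        # Assumption: "within a year old" is treated as 365 days.
        if today - self.last_updated > timedelta(days=365):
            notes.append("Current: data is more than a year old")
        if not self.cited:
            notes.append("Cited: source of the data is not cited")
        return notes


# Illustrative values only.
check = RocccCheck(reliable=True, original=True, comprehensive=True,
                   last_updated=date(2022, 1, 1), cited=False)
for note in check.limitations(today=date(2023, 6, 1)):
    print(note)
```

The returned notes can be pasted into the report, email, or query so that users see the limitations alongside the data.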


File Formats

When handling data and developing reports for internal and external stakeholders, the file formats utilized are usually Microsoft Excel (.xlsx) and Microsoft Word (.docx) for static data. Dashboards use tools like Hypertext Preprocessor (PHP), Tableau, and Power BI for interactive data. In some cases, PowerPoint (.pptx) is used to share data in a presentation format and usually contains a combination of the formats above. Visualizations or charts from tools like Tableau used within a PowerPoint are usually formatted as images like .png, .jpeg, etc.

For certain databases that pull data using Python or SQL, the comma-separated values (.csv) format is preferred for storing and transferring the data.

When working with tabular data, such as Excel spreadsheets on which analysis may be performed, save the files in CSV or .xlsx format. When the data will be transferred across databases, software, and applications, CSV is the better option because it is simpler and more widely compatible with those programs. CSV also stores every value as plain text exactly as written, whereas Excel may reinterpret values when it opens a file (for example, by dropping leading zeros).
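As a small illustration of CSV storing values as is, the sketch below reads and rewrites a CSV extract with Python's standard `csv` module. The column names and values are hypothetical and are not taken from Banner or EPCS.

```python
import csv
import io

# Hypothetical extract; the column names and values are illustrative only.
raw = "fund_code,amount\n001234,1500.50\n"

rows = list(csv.DictReader(io.StringIO(raw)))
fund_code = rows[0]["fund_code"]   # read as text, so the leading zeros survive
amount = float(rows[0]["amount"])  # converted to a number only when needed

# Writing the rows back out produces the same text that was read in.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["fund_code", "amount"],
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue() == raw)
```

Because every field is read as text, any conversion to a number or date is an explicit step in the code rather than something the program does silently.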

Some risks to accessibility are as follows:

  • Encrypted data can be lost if the key (e.g., a password) is lost. Encrypting data files is therefore discouraged, and using secure storage systems is required.
  • Formatting, formulas, and additional worksheets added to a CSV file cannot be saved and will be lost. If such changes are needed, save the file as an Excel workbook.

For data cleaning, analysis, and reporting, Excel is the more suitable program. It also allows for multiple worksheets and charts.


Organizing Files and Naming Conventions

Below are guidelines for organizing and naming files to help manage data files and folders. Folder names should be clear, concise, and consistent for easy identification.

Folders can be named with spaces and with the first letter of each word capitalized, like “Naming Conventions”. For other files, like Excel documents, the OVPRI data analytics team currently uses a more specific naming convention that includes the date (YYYY-mm-dd or FYxxPxx), the project name, the version number with a leading zero, and the file type. Other specifications, like the data source (EPCS, IDR, etc.), can be included as needed. Each element is separated by an underscore, and grouped words or characters within an element are separated by a hyphen. Avoid special characters. Below are examples:

  • 2023-03-17_Research-Grants.xlsx
  • 2023-03-17_Research-Grants_IDR_V02.xlsx
  • FY23P06_Research-Grants.xlsx

FY stands for fiscal year, and P is the period. In a few cases, AY (academic year) is used. 
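The convention above can be applied consistently with a small helper. This is a sketch under stated assumptions rather than a tool the team uses: the function names `build_file_name`, `date_prefix`, and `fiscal_prefix` are hypothetical, and the element order (date, project, optional source, optional version) is inferred from the examples.

```python
import re
from datetime import date

# An element may contain letters and digits, with hyphens between grouped words.
_ELEMENT = re.compile(r"[A-Za-z0-9]+(-[A-Za-z0-9]+)*")


def date_prefix(day):
    """Format a date as YYYY-mm-dd."""
    return day.strftime("%Y-%m-%d")


def fiscal_prefix(fiscal_year, period):
    """Format a fiscal year and period as FYxxPxx (two digits each)."""
    return f"FY{fiscal_year % 100:02d}P{period:02d}"


def build_file_name(prefix, project, extension, source=None, version=None):
    """Join the elements with underscores: prefix_project[_source][_Vnn].ext"""
    elements = [prefix, project]
    if source:
        elements.append(source)
    if version is not None:
        elements.append(f"V{version:02d}")  # leading zero, e.g. V02
    for element in elements:
        # Reject spaces, underscores, and special characters inside an element.
        if not _ELEMENT.fullmatch(element):
            raise ValueError(f"Unexpected characters in element: {element!r}")
    return "_".join(elements) + "." + extension


print(build_file_name(date_prefix(date(2023, 3, 17)), "Research-Grants", "xlsx"))
print(build_file_name("2023-03-17", "Research-Grants", "xlsx",
                      source="IDR", version=2))
print(build_file_name(fiscal_prefix(2023, 6), "Research-Grants", "xlsx"))
```

The three calls reproduce the three example file names listed above. A project name containing a space, such as "Research Grants", raises a `ValueError` instead of producing a non-conforming name.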


Documentation

It is important to create clear, detailed, and up-to-date documentation for each request received and all compliance reports. The document should include:

  • The name of the author.
  • The date created.
  • The date last updated, which is relevant for periodic reports. There are usually modifications or changes in methodologies with the continuous improvement of data entry and other factors.
  • The requestor.
  • A purpose and summary of the request.
  • A step-by-step guide of the methodology and process utilized to develop the report. This includes formulas, the programs used, data validation processes, collaborators, and data sources such as EPCS and Banner.

The document is usually created using Microsoft Word. Additionally, a cover page is usually created and included within Microsoft Excel reports. It includes a summary of the data and other information like the creator, recipient or requestor, date, and description of the report content. See below for reference:

Example of a data request cover page, which lists the creator (created by), date, intended recipient or requestor, description of the report, and data sources.

As the OVPRI data analyst, keep the documentation and other reports easily accessible to your immediate supervisor, either on your desktop or on the shared drive.

When developing a report in Cognos, ensure you enter a brief description of the report and create a comments page that displays all the changes made. Review the “5-Year Grant Expenditures by Home Org” report in Cognos as an example. See the path below:

  • Team content > Departmental Folders > Vice President … Home Organization > Base Report as of 2023-01-11 > 5-Year Grant Expenditures by Home Org
  • Additionally, adding comments into manually created data items and filters to explain the logic is highly encouraged. This allows others who may be reviewing the code to have a good understanding of the report.

When writing any other type of code, whether in Excel, SQL, Python, or another tool, add comments where necessary to explain the logic behind it. Comments make the code easier to read and understand for both the creator and other users.


Security and Storage

The university is committed to the privacy and security of data. OVPRI works with student data, staff and faculty data, payroll data, performance data (including the turnaround time of tasks performed within teams in the OVPRI), and financial data as it relates to research and grant activity (proposals, awards, expenditures, and facilities and administrative costs [F&A]).

Generally, OVPRI does not disclose data that may be identifiable or confidential. In cases like survey data, where student or staff information may be identifiable within specific units, a minimum cell size of 10 applies, meaning that any group with fewer than 10 responses cannot be reported. This rule comes from the university’s Office of General Counsel.
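The cell size rule can be applied before a table of counts is shared. The following is a minimal sketch: the function name `suppress_small_cells`, the masking text, and the unit names and counts are hypothetical.

```python
MIN_CELL_SIZE = 10  # threshold stated in the rule above


def suppress_small_cells(counts, minimum=MIN_CELL_SIZE):
    """Return a copy of counts with every value below the minimum masked."""
    marker = f"<{minimum}"
    return {
        group: (count if count >= minimum else marker)
        for group, count in counts.items()
    }


# Hypothetical survey response counts by unit (illustrative values only).
responses_by_unit = {"Unit A": 42, "Unit B": 7, "Unit C": 10}
print(suppress_small_cells(responses_by_unit))
# {'Unit A': 42, 'Unit B': '<10', 'Unit C': 10}
```

Note that a published total can sometimes let a reader work out a single masked cell, so totals may need a review as well.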

Student Data

Regarding student records, the university complies with the Family Educational Rights and Privacy Act (FERPA), which is sometimes called the Buckley Amendment. According to the university’s Student Records Privacy Policy, FERPA establishes the students’ rights and the institutions’ responsibilities regarding the privacy of education records and provides guidelines for maintaining the confidentiality of the records as well as monitoring the sharing of information from those records.

In terms of reporting within OVPRI, the key rule to note is the disclosure of student data as it pertains to the release of directory and non-directory information. Directory information like students’ names, addresses, class level, graduate teaching status, etc. can be released without consent and upon request to a third party, except when release is restricted. Non-directory information like date of birth, gender, financial records, etc., can only be released with consent or because of an exception.

Most of the student-level data OVPRI reports on are aggregated and do not include identifiable information. See the UO FERPA Training or Student Records Privacy Policy for more information.

HR and Employee Data 

Data items such as employee name, date of hire, work phone, positions held, salary rates, and termination date are considered public records and can be shared internally and externally; however, other information, like the employee’s representative, requires written consent from the employee to be shared. Social security numbers, home addresses, and home phone numbers are considered confidential. Visit the Classified Employee Records and Data page for more information.

Disaggregated demographic employee data have limited availability to users. Internally, demographic data should be reported in aggregate formats and shared externally with the state upon request. Visit the Employee Demographics Information page for additional information.

Grants and Financial Data 

Some grants data is made public through compliance reports such as the National Science Foundation Higher Education Research and Development survey and the National Institutes of Health Biomedical Research and Development Price Index Survey. However, within OVPRI, disaggregated identifiable data sets with this information are considered confidential in the absence of a request or consent. Visit the Code of Responsibility for Security and Confidentiality of Records and Files for more information.

Data Security

  • Keep data considered confidential off the internet, including public dashboards.
  • Set passwords on computers and files where necessary.
  • Be wary of phishing emails and other scams.

Backups and Storage

  • Back up data files and folders on SharePoint or the Q drive frequently. Consult Information Services for assistance with this process or for testing.


Sharing and Archiving

Within OVPRI, data is shared both internally and externally. Consider the following when preparing to share data:

File Formats 

Files are stored in SharePoint, on a shared desktop drive, and on personal desktops. They should be shared in a usable and accessible format and location.

Documentation 

For every report or request made, OVPRI documents the request, methodology, author, requestor, date, and other additional information. When there is an update to the report, changes are made in the document and a “Last Updated” date is included or updated to ensure completeness. See the “Documentation” section within this report for details.

Ownership and Privacy

Prior to sharing data with customers, always confirm with your supervisor the confidentiality of the data in relation to the requestor. Some disaggregated data, especially when it includes principal investigator (PI) information and HR data, is considered confidential. Approvals for Sponsored Projects data may be required via data request forms.

  • For PI-related information that may be considered confidential, approval is required in the following cases:
    • Office assistants and executive assistants within colleges requesting college-, division-, or unit-level data.
    • PIs requesting data beyond their scope or units.
    • Faculty administrators requesting data beyond their scope or units.
  • Approvals can be sent via email to researchanalytics@uoregon.edu.
    • The approval can be in the form of an email from your approver to you.
  • Requestors and approvers are:
    • For office assistants, the dean of the college will request and/or approve.
    • For executive assistants, their supervisor or the dean of the college will request and/or approve.
    • For PIs, their supervisor or the dean of the college will request and/or approve.
    • For financial administrators, their supervisor will request and/or approve.
  • Approvers may also be the original requestors of the data.
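The requestor and approver pairs listed above can be kept as a small lookup table when requests are tracked in code. This is only a sketch: the dictionary `APPROVERS`, its role keys, and the function `approvers_for` are hypothetical names, and the pairs are copied from the list above.

```python
# Requestor role -> who may request and/or approve, as listed above.
APPROVERS = {
    "office assistant": ["dean of the college"],
    "executive assistant": ["supervisor", "dean of the college"],
    "PI": ["supervisor", "dean of the college"],
    "financial administrator": ["supervisor"],
}


def approvers_for(role):
    """Return the approvers for a role, or raise if the role is not listed."""
    try:
        return APPROVERS[role]
    except KeyError:
        raise ValueError(f"No approver defined for role: {role!r}") from None


print(approvers_for("executive assistant"))
# ['supervisor', 'dean of the college']
```

Raising an error for an unlisted role, rather than returning an empty list, makes it harder to release data without an identified approver.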

To access the data access forms for HR and finance data, visit the UO Service Portal for IDR.


Citing Data

Citing data in OVPRI includes documenting the source of the data, which includes Banner, Cognos, EPCS (soon to be migrated to the Research Administrative Portal [RAP]), and SQL. Direct links to location, queries, and path description are important to note.

Links usually suffice to cite data from the web, both internal and external.


Confidentiality and Ethical Concerns

It is important to maintain the confidentiality of data concerning researchers, staff, faculty, and students as required before publishing or sending reports. A few things to consider are:

  • Consider the extent to which your data contains direct or indirect identifiers. We must protect the identities of researchers, faculty, staff, and students, especially in cases where the data is not public information.
  • Consult your supervisor, colleagues, or the Office of General Counsel for a confidentiality review, which could take the form of a Q&A session via email, Zoom, Teams, or in-person.

OVPRI staff are encouraged to sign the confidentiality agreement form.
