IT and Applications
Unit 5: Database and Business Intelligence
Databases, data concepts, file system vs database, data models, DBMS, business advantages of databases, and business intelligence — data warehouse and data mining.
Introduction
Modern organisations generate huge amounts of data every day — transactions, customer records, sensor readings, web logs. A database is the structured way to store this data so it can be retrieved, updated, and analysed reliably.
Business Intelligence (BI) is the next step: turning that stored data into insights that drive decisions.
Database
A database is an organised collection of related data. It is designed to:
- Store large amounts of data efficiently.
- Provide fast and reliable access.
- Maintain integrity (no contradictions, no duplicates).
- Allow many users to access it simultaneously.
Examples: a school’s student database, a bank’s transaction database, an e-commerce site’s product database.
Data concepts and characteristics
Key concepts:
- Data — raw facts (e.g.
25,"Ram"). - Information — processed data with meaning (e.g.
Ram is 25 years old). - Record — a collection of related fields (one row in a table).
- Field — a single piece of data (one column in a table).
- Entity — an object the data describes (Student, Product).
- Attribute — a property of an entity (Name, Age, Price).
Characteristics of good data:
- Accurate, complete, consistent, timely, relevant, and accessible.
Database vs file system
Before databases, data was stored in flat files. Both still exist, but databases solve many of the file system’s problems.
| Feature | File system | Database |
|---|---|---|
| Data redundancy | High (same data in many files) | Low (centralised storage) |
| Data consistency | Hard to keep consistent | Enforced by the DBMS |
| Security | Limited | Strong (user accounts, roles) |
| Concurrent access | Difficult | Built-in |
| Query power | Limited (read whole file) | Powerful (SQL) |
| Backup / recovery | Manual | Automated |
Database models
The way data is structured inside a database is called its model.
- Hierarchical model — tree-like, parent-child (old IBM IMS).
- Network model — graph of records with multiple parents.
- Relational model — tables with rows and columns; most widely used today (MySQL, PostgreSQL, Oracle, SQL Server).
- Object-oriented model — stores data as objects, like in programming.
- NoSQL models — modern, scalable models for big data and the web:
- Document — MongoDB, Firestore.
- Key-value — Redis, DynamoDB.
- Column-family — Cassandra, HBase.
- Graph — Neo4j, Amazon Neptune.
Database Management System (DBMS)
A DBMS is the software that lets you define, create, query, and manage a database.
Functions of a DBMS
- Define the structure of the database.
- Insert, update, delete, and retrieve data.
- Enforce data integrity and validation rules.
- Provide security and user access control.
- Handle transactions (ACID: Atomicity, Consistency, Isolation, Durability).
- Back up and recover data.
Popular DBMS examples
- Relational: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, MariaDB.
- NoSQL: MongoDB, Cassandra, Redis, Couchbase.
- Cloud-native: Amazon RDS, Google Cloud SQL, Azure SQL, Snowflake.
Example — SQL query
SELECT name, marks
FROM students
WHERE marks > 80
ORDER BY marks DESC;
This single query asks the DBMS to fetch all students with marks above 80, sorted from highest to lowest. Doing the same with flat files would take many lines of code.
Business-level advantages of databases
- Better decisions — managers have accurate, up-to-date data.
- Reduced cost — single source of truth instead of duplicated files.
- Better customer service — fast retrieval of customer info.
- Compliance — regulators require structured, auditable data.
- Scalability — modern databases handle millions of users (Facebook, Amazon).
- Integration — many applications share the same database.
Business intelligence
Business Intelligence (BI) is the process of analysing business data to support decision making.
Tools: Power BI, Tableau, Google Looker, Qlik.
Data warehouse
A data warehouse is a central repository that collects data from many sources (sales, marketing, finance, etc.) and stores it in a way optimised for analysis (not daily transactions).
Properties of a data warehouse (Inmon):
- Subject-oriented — organised around topics (customers, products).
- Integrated — data from many systems combined.
- Time-variant — keeps historical data.
- Non-volatile — data is added, not modified.
Examples of data warehouse platforms: Snowflake, Google BigQuery, Amazon Redshift, Azure Synapse.
Data mining
Data mining is the process of discovering patterns and insights in large datasets, often using statistical and machine-learning techniques.
Common tasks:
- Classification — predicting categories (spam vs not spam).
- Clustering — grouping similar customers.
- Association — “people who buy X also buy Y” (market basket analysis).
- Anomaly detection — fraud detection.
Database applications
Databases power almost every modern service:
- Banking — transactions, loans, account balances.
- E-commerce — products, orders, cart, reviews.
- Healthcare — patient records, prescriptions.
- Education — student records, online learning platforms.
- Telecom — billing, call detail records.
- Social media — posts, likes, friends, messages.
- Government — citizen ID, taxes, voter rolls.