Car Rental Version 2.0
Changes in Car Rental 2.0 vs 1.0
Optimizing Search Queries with INT
Using the INT data type for processing all search queries in the back-end significantly improves the speed of SQL queries compared to using STRING. Numeric values are more efficient for indexing and comparison operations, resulting in faster search results, particularly for large datasets.Enhancing Date Management
For managing rental periods (rented-to and rented-from), I implemented the DATE class. In version 1.0, dates were split into three INT categories (year, month, and day) and concatenated for use, which was less efficient. Switching to the DATE class streamlined the process and made date handling more robust and intuitive.Removing the "Remove" Function
The "remove" function for employees and other entities was eliminated due to complications arising from references in related tables. Instead, requests for data deletion are directed to back-end programmers, ensuring that the data is cleaned up properly without causing issues in database integrity.Introducing an Active/Inactive Status for Vehicles
To track vehicle availability, I added an "active/inactive" column. This allows the system to indicate whether a vehicle is available, under maintenance, or retired. It provides a clear and efficient way to manage the fleet's operational status.
Software Information
MySQL (32 bit), ver. 8.0.39
.NET Framework 4.8.1
Framework UI: Window Forms
Designed For
Educational purposes
Open to use for free
For this project, I chose to use numbers instead of strings in the database to optimize the back-end for search queries, especially when working with large, cloud-hosted databases. Using numbers reduces file sizes, which minimizes packet loss by requiring fewer packets to be transmitted during operations.
Additionally, the conversion of numbers into human-readable formats is delegated to individual PCs. This approach distributes the workload across multiple machines, reducing the processing burden on the central server system and improving overall efficiency.
CHAR(n) reserves a set amount of bytes equal to n, regardless if its all used or not.
CHAR(n) will pad the rest of the unused spaces in the STRING.
Faster for query, data can be pulled straight from the columns.
Better for speed.
Research Findings for Efficiency
VARCHAR(n) reserves only what we need, byte equal to length of string inputted.
VARCHAR(n) will store an additional 1 byte for the length of the string, 2 if char is over 255
Slower for query, data manipulation and index look ups is needed because length varies.
Better for storage space.
vs
CHAR(n)
VARCHAR(n)
INT
INT reserves 4 bytes.
Results in the fastest query.
Limited use given it can only be numbers
*We will use UNSIGNED given we have no need for negative numbers
*We will use UNSIGNED given we have no need for negative numbers
Using BIT, TINYINT, SMALLINT, INT to replace most CHAR to store information.
Querying will remain efficient for databases as they get bigger.
The list of cars we get after being filtered through querying can be processed on the local computing power to change INT to STRING into human readable form.
Processing a list filtered to a branch for lag is not a concern of mine, as a local branch will have limited stock available, usually < 100 vehicles.
Use BIGINT (8 bytes) if needed for transactions ID.
Use CHAR(n) for exact length (car vin).
Notes
Optimizing Queries with Integer Identifiers
A creative idea for improving database performance is using INT as an identifier rather than STRING. This method can speed up queries, especially when dealing with large datasets, as comparing integers is computationally faster than comparing strings. However, this approach introduces a tradeoff: identifying what each integer represents when inputting or interpreting data can become confusing.
To address this, we could implement a function to map identifiers dynamically. For example, if a user searches for "Ford," the function would return its corresponding integer value, such as 1, which is then used in the query. The database operations would still use the efficient integer comparison, but the user would work with human-readable inputs.
Example Workflow:
User searches for "Ford."
The system converts "Ford" to INT 1.
The database query compares INT 1 to the entries in the database instead of performing a string comparison.
This setup allows us to transition from a human-readable string to a performance-friendly integer with minimal impact on the user experience.
Why Use This Approach?
In a scenario like a car rental system, querying the database happens far more frequently than adding new vehicles. For instance, analyzing rental behaviour or checking availability is a high-traffic activity, while adding new vehicles happens less often. Optimizing queries by using integers could therefore provide significant performance gains over time.
The Downside
While this approach can save query time, it may introduce additional complexity when inputting or managing data. Maintaining and referencing the mapping between integers and their corresponding strings can become a source of potential errors. For smaller databases or when memory usage isn’t a concern, the optimization may not justify the added headache.
Is It Feasible?
With modern hardware, memory is relatively cheap, and databases are highly optimized for various use cases. Unless the database is exceptionally large or query speed is critical to the user experience, the benefits of this optimization might not outweigh its complexity. However, in high-performance systems or scenarios involving massive datasets, it could be a worthwhile tradeoff.
Conclusion
Using integers for identifiers can be an exciting optimization for search-heavy applications, but it’s essential to weigh the performance benefits against the additional effort required for implementation and maintenance. The choice depends on the scale of the database, the frequency of queries, and the importance of user experience simplicity.
Data Table Information
CarType Table
CarTypeID INT PRIMARY KEY
Increment by 1 every cycle
CarClass TINYINT
Economy
Full
Luxury
CarFrame TINYINT
Sedan
Coupe
Convertible
CabinSize TINYINT
Small
Medium
Full
Engine TINYINT
Gasoline
Electric
Hydrogen
DriveTrain TINYINT
FWD
RWD
AWD
4WD
Trim TINYINT
Base
ParticallyLoaded
FullyLoaded
We are using the full 5 Bytes for numeric (Daily cost, etc), given there is no advantage not using it.
Numeric(p, s) Storage size:
p = 1-9 5 Bytes
p = 10-19 9 Bytes
p = 20-28 13 Bytes
p = 29-38 17 Bytes
p=total length of numbers
s=length of digits saved after decimal point
So we are better off using the full 5 Bytes, hence NUMERIC(9, 2).
Numbered columns means they reference information
Full list of how they are inputted
https://docs.google.com/document/d/1u6Uqw2SkYQRBUQ0bz41EC24wJZYTIU_j7JGiyp3rfao/edit?usp=sharing
Car Table
CarVIN CHAR(17) PRIMARY KEY
CarTypeID INT FOREIGN KEY TO CARTYPE
1
Brand SMALLINT
Ford
ModelName INT
F-150
We can use the google sheets file to document and organize alphabetically/search function for quick reference. This allows us to add any vehicle on the tail end to our database
YearMade SMALLINT
2023
Color TINYINT
Red
Mileage INT
Vehicle mileage in KM
LastTuneUp INT
Last tune up since (KM)
BranchID SMALLINT
BranchNumber
CarInCycle TINYINT
Out of cycle, not available(sold)
An example how ModelName is organized to keep track
https://docs.google.com/spreadsheets/d/1PH1pNVUYcufRgBAfJb-KiW9c81g5KjPVrhasOlF94Yo/edit?gid=0#gid=0
Branch Table
BranchID SMALLINT PRIMARY KEY
BranchID
BranchName VARCHAR(50)
BranchName
AreaCode SMALLINT
780
ContactNumber MEDIUMINT
456-7890
Street VARCHAR(100)
Street address of branch
PostalCode CHAR(6)
Postal Code of branch
City MEDIUMINT
Edmonton
Province SMALLINT
AB
Country TINYINT
Canada
Employee Table
EmployeeID INT PRIMARY KEY
1
BranchID TINYINT FOREIGN KEY TO BRANCH
0
OperatorID BIGINT
12345-678
FirstName VARCHAR(50)
Employee first name
Lastname VARCHAR(50)
Employee last name
Street VARCHAR(100)
#100-123 ST, NW
PostalCode CHAR(6)
T1T 1T1
DOB DATE
2000/01/01
AreaCode SMALLINT
780
ContactNumber MEDIUMINT
456-7890
Password VARCHAR(25)
Employee password
Active BIT
If the employee is active
Note: Employees can only have 1 address
Customer Table
CustomerID INT PRIMARY KEY
1
OperatorID BIGINT
12345-678
FirstName VARCHAR(50)
Customer first name
Lastname VARCHAR(50)
Customer last name
Street VARCHAR(100)
#100-123 ST, NW
PostalCode CHAR(6)
T1T 1T1
DOB DATE
2000/01/01
AreaCode SMALLINT
780
ContactNumber MEDIUMINT
456-7890
Password VARCHAR(25)
Employee password
Note: Customers can only have 1 address
RentalTransactions Table
TransactionID INT PRIMARY KEY
1
RentalCost NUMERIC(9, 2)
rental cost is calculated with tax here
RentedFrom DATE
Date of rental from
RentedTo DATE
Date of rental to
EmployeeID INT FOREIGN KEY TO EMPLOYEE
Employee making the sale for accountability
CustomerID INT FOREIGN KEY TO CUSTOMER
Customer ID information for accountability
CarVIN CHAR(17) FOREIGN KEY TO CAR
Car ID that is being rented
BranchPickUp SMALLINT FOREIGN KEY TO BRANCH
Location the car is being picked up from
BranchDropOff SMALLINT FOREIGN KEY TO BRANCH
Location car is being dropped off
Relational Model
CarType to Car
CarTypeID
One to Many
Car to RentalTransAction
CarVIN
One to Many
Car to Branch
BranchID
Many to One
Branch to Employee
BranchID
One to Many
RentalTransaction to Employee
EmployeeID
Many to One
RentalTransaction to Customer
CustomerID
Many to One
RentalTransaction to Branch
BranchPickup -> BranchID
Many to One
RentalTransaction to Branch
BranchDropOff -> BranchID
Many to One
Inserting Data to Tables
Created all instances of CarType on the table, instead of adding them as we encounter them.
You could add them as you go, which would make the system faster and be able insert the correct daily, weekly and monthly costs.
But by adding all our CarType on the table will enable us to stress test for the end result.
I may bump up the numbers for max.
Automated inserting CarType data into our table
USA Data (5000 entries)
5000 Branch ID
Hertz has 2,909 car rental branches in the USA. We will test with 5,000 branches for good measure.
range(0-4999)
700 Cities
USA has 336 cities over 100,000 population. We will account up to 700 cities that are viable for a car rental company.
https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population
range(0-700)
50 States
range(0-49)
Creating Branches for the USA & Canada
Canada Data (3000 entries)
3000 Branch ID
range(10,000-13,000)
100 Cities
range(1000-1100)
10 Provinces
range(100-109)
Automation for other tables have been added to the GitHub.
Automating: ID # <-> Readable Form
Alternative example for converting INT ID to a human readable form instead of having to reference the excel file each time.
To make life easier, we can extract data from the excel file instead of labouring time to hard code onto our program. Using the function we can print out the data onto the console and copy paste onto our program.
Hard coding in our case is necessary, otherwise we will be opening and closing the excel file each query that involves car_model_id.
Opening and closing costs us efficiency, where holding a tuple of this size is negligible and is stored in ram which is good for quick access.
Here is the final product where we stack our functions together. Where the main function pulls all the necessary info from the car type ID 20.
We can see the results from ID 20:
Car class = 0
Frame = 0
Cabin size = 0
Engine = 1
Drive train = 2
Trim = 2
It is also recommended to create functions going the other-way to increase development. Where searching by car model, ex. A4, would return the car model ID.
TODO
Implement refund for car transaction
Put the car back as available if refunded is processed and car rental is in the future.
Normalize location, ex. city, province and country can be all a seperate table with locationID