Retail Analytics Project

ticketdonkeyAI and Robotics

Nov 25, 2013 (3 years and 11 months ago)

81 views





IE 330



Retail Analytics Project

12/3/11

Daquan Sisco

Sam Resnek

Kevin Connolly

Elliot Hood













Introduction



Open*Mart is a national retail chain that sells a variety of products. Open*Mart operates in
several locations and its small and large retail stores are distributed all over the United States.
The manager of Open*Mart at its Monroeville, PA location want
ed to conduct an initial analysis
of the store. A Business Analytics group was hired to perform the analysis in which the retail
transactions data
related to

two year

sales was dumped into a data warehouse and examined. The
store manager also wanted a de
mographic profile of the customers who visited the Monroeville
store. Having demographic data about the customers who frequented the store would be
extremely useful in the analysis and would allow relationships to be made based on the data.
Having the t
ransaction data gives information about the items, the customers who bought these
items, as well as the vendors who supplied these items.
This data also allows the Business
Analytics group to summarize the information on each kind of item with respect to
the vendor,
with respect to time, and with respect to the quantity of the items. The demographic data
contains information like: family size, gender, ethnicity, age, occupation etc. This data creates
opportunities for relationships to be built.
A prime

example of these relationships would be
the
relationship between the customers and the ty
pes of items that they bought. Another relationship
would be the relationship between time and sales volume to observe any trends on a daily or
weekly basis. The bu
ying patterns of different customers could also be analyzed in order to
identify trends and make relationships. For example, a certain item could be bought only by low
or mid income customers which means that clustering techniques could be used to group
c
ustomers based on their preferences.

The CEO of the Open*Mart, enquired about the status of
the Monroeville store and wanted all of these trends to be included in a report which would be
presented to the board of directors of the company.

Our job as a gro
up was to
act as the Business Analytics group and produce a full study/analysis
of the Open*Mart at the Monroeville, PA location. After conducting the analysis we would
report our findings back to the store manager who would in turn send in the report to
the CEO of
Open*Mart. Quality and thoroughness is an important aspect of the analysis and mistakes could
lead to a loss in funds for the company while identifying trends and creating relationships could
also mean big gains financially.




Project
Objectives


The primary goal of the project is to make recommendations for the store bases on the retail
information that was collected in the analysis.

In order to do so, a database schema would need
to be created as well as an ER diagram that would incl
ude all of the necessary relationships,
entities, primary keys etc. ER diagrams help to organize these relationships and give a better
understanding of how the data relays to each other.

Queries will need to be created based on the information provided.
The queries will help to
identify the specific trends in the data and allow us to make accurate recommendations for the
company. Snapshots of the SQL of the queries will be useful in the future for reference
purposes. Also, the queries will help in makin
g decision involving the clustering of data.

A time series analysis will need to be created that will display the total units sold of each item
type.

This will also help in the clustering aspect of the project because it will help make it easier
to make r
ecommendations and identify trends. The clustering of the customers will be mainly
based on their buying patterns over the 104 weeks. Based on the clusters/groups of customers
that are obtained from the algorithm, any commonality (if any) will need to be

identified.

All of the data that is collected will need to be imported into excel so that graphs can be created
and analyzed. Access database tables will need to be created as well as snapshots of these tables.


Problem Description


The CEO of Open*Mar
t, Dr. Liying from China, has enquired to Mr. Yoo abou
t the status of his
Monroeville store. In order to perform analysis of the store, Mr. Yoo hired Mr. Reddy of
Business Analytics to gather data about customer demographics and transactions of a two year

span. Transaction data

includes Customer ID, Item Type, Item Number, Vendor ID, Week, Day,
and Units Bought.

Mr. Reddy contacted Mr. David, the Customer Relationship Manager, to
collect demographical data. Demographic data includes
Customer ID, Family s
ize, Income,
Ethnicity, Dogs, Cats, TVs, Age, Children, Work Hours, Occupation, Education, and
Subscription to magazines or
newspapers.

Dr. Liying has stated some questions in specific that she would like answered. These
specific questions include forming
an access database containing columns for analysis, to export
relevant data into Excel and perform a trend analysis and visualization, and use a k
-
means
clustering algorithm to cluster customers based on their buying patterns.

In addition to performing t
he analysis, extended insight should be put into the data established.
This insight should provide Open*Mart with understanding into how to promote products
according to specific customer attributes. Any further analysis that might help upper
management m
ake profitable decisions is considered beneficial.

Methodology


There were many processes conducted during this study to gather the necessary data needed to
make recommendations as well as identify different trends.

First an ER diagram was created
based o
n the data that was given. The ER diagram illustrated the relationships between the
entities within the database.

There are four main steps involved in creating an ER
diagram:

1
)
Identify

the entities, 2
) Determine

all significant interactions, 3
)
Analyze

the nature of the
interactions, 4
) Draw

the ER diagram.

The boxes are used to represent the entities, diamonds are
normally used to represent the relationships, and the ovals are used to represent the different
attributes.

The
cardinality

defines

the relationship between the entities in terms of numbers. An
entity may be

optional
: for example, a sales rep could have no customers or could have one or
many customers; or
mandatory
: for example, there must be at least one product listed in an order.
The three main cardinal relationships are: one
-
to
-
one, expressed as 1:1; one
-
to
-
many, expressed
as 1:M; and many
-
to
-
many, expressed as M:N.

The patterns help focus on how the database
actually works with all of the interactions and data flows
.
Several qu
eries were needed in order
to pull out certain data needed to have a basis for our assumptions. For example, we made
queries that would show the top 5 items sold and the bottom 5 items sold for each season during
the 104 weeks. This would allow us to see

which items were sold based on availability or
customer preference during those seasons and also allow us to make an assumption on what
items would be essential to have in stock during those specific seasons. Another query that was
made was the TotalCoup
onWorth query which allowed

us to identify the coupon that was used
the most and the amount of sales that it accrued in cents. The graph that was imported into excel
based on this query was converted into dollars in order to make it easier to make any fut
ure
calculations.

The items were also sorted into the top items that were sold by day, for each day of
the 104 weeks. This would allow us to make an assumption on whether or not certain days
affected customers buying preferences.


Another important part
of performing the analysis was grouping customers based on buying
habits. To group the customers into groups with similar buying patterns, K
-
means clustering was
performed. K
-
means clustering provides groups of customer with similar buying patterns based

on individual item purchasing. K
-
means clustering only allows the comparison of the purchasing
amounts of two different items so these items were carefully selected. Initially is made sense to
pick the most sold item and the least sold item because the
final goal involves relating customer
demographics to these purchasing patterns and the making a recommendation to improve all
aspects of Open
-
Mart. It was assumed that comparing these two items would provide the
demographics of the customers purchasing t
he most sold item and provide for a solid
recommendation to target these customers with advertising of the least sold item. Using these
demographics could also provide information as to why these items are the most and least sold
and allows conclusions to

be drawn as to how to improve sales of the least sold items, determine
better store location of the item to better target customers, etc. Unfortunately the least sold item
had numbers that were too low to provide proper clustering proving this prediction

to be
incorrect. The second decision was to compare the buying patterns of two items that ranked in
the top five most sold items. The next step is to plot all of the points on a given x
-
y scatterplot
with the x
-
axis being (in this case) the purchasing a
mount of a given item. The y
-
axis is the
purchasing amount of the other item. These points then need to be clustered based on the buying
pattern of the customer. Simply put all of the customers that purchase a lot of item A and not a
lot of item B will
be in the same group and so forth. In order to determine the clusters an initial
centroid is chosen for each cluster. The cluster amounts must be pre chosen based on desired
analytical results. If there are six clusters chosen there will be six initial
centroids placed
throughout the scatterplot. The distance from each centroid to a surrounding point is calculated
and then those two points become a cluster. A new centroid is calculated using the average of
the two points and the next distance compariso
n is based off of the new centroid. This continues
with a point being chosen and the closest centroid to this point used to calculate distance and
then recalculate the new centroid. The process is complete when no more points are left to
choose from and
therefore they all have a designated cluster. These clusters are a representation
of similarity between the purchasing amounts of the selected items. The second decision of
comparing items ranking in the top 5 purchased items proved to be successful as e
ach of the
clusters were spread out through the scatterplot and provided a large difference in the purchasing
amounts of the items between each cluster.

To add to the analysis of the clusters box and whisker plots were used to see different statistical
data pertaining to each cluster and the purchasing amounts of that cluster per item. These plots
provide for a much better visual analysis of the data and
making better decisions as to relating
the demographics of the cluster to the purchase amount of the two items compared.

Once the cluster organization was completed the customers were again organized per cluster and
their demographic data was compared.

This provides for relationships to be determined based on
buying patterns, which allows for proper cycling of coupons, item placement in the store, and
other improvements that can be made to improve sales for Open
-
Mart. These improvements are
gone into m
ore detail in the recommendation section of this report.


Group Members and Roles


There were four people in our group and we divided the work load equally amongst ourselves.
Samuel Resnek was in charge of creating the queries that were needed to identi
fy the different
trends in the data that was provided. The queries like top 5 and bottom 5 items sold revealed
relevant information necessary to develop trends. Daquan Sisco was in charge of taking all of
the information that was yielded by the queries a
nd transferring it into Excel in order to create all
of the graphs that would visually display the trends that were identified by the queries
. He was
also in charge of creating the ER diagram that illustrated the relationships within the database.
Kevin
Connolly was in charge of what completing the K
-
means clustering as well as the
demographic analysis of the data, based on the information that found. Elliot Hood completed
the box and whisker analysis based on the different items.



Database Design


In
creating our database, we created 5 tables to successfully organize the given data. The tables
we created are as follows: Customer, Transaction, Media, Product and Coupon. Snapshots of
each table and their attributes along with a few rows data are shown on

the next page.








Next, we added our relationships on Access to produce a tree diagram of our database. We also
made an ER diagram that will show all of the primary and foreign
keys, cardinality,
the
relationships

and the participation of each

between our tables. These two items are pictured
below.




The next step in analyzing the data from the past two years for Open*Mart was creating
meaningful queries that will help us to identify trends of sales over the past two years. The
queries and a

description of what each does is located here:


Top 5 Items:

This query selects the top 5
items in terms of sales for the whole 2 year
data period.



Bottom 5 Items: This query selects the
bottom 5 items in terms of sales for the 2 year
data period.



Item By Day: This query evaluates the top and bottom 5 items that we identified in the previous
queries to see how much of each item was purchased on any given day of the week for the 2 year
data period.



Item By Week : This is a series of 10 queries th
at were generated to see how many of each type
of item were sold week by week for the 2 year data period.





































Item By Season: This series of 4 queries were made so that we can determine how many of each
of our 10
focus items were sold during the seasons of winter, spring, summer and fall over the 2
year data period.









0
1000
2000
3000
4000
5000
6000
7000
8000
Total Item Sales

Items

Top and Bottom 5 Items




Total Coupon Worth: This query is used to
sum the value of a coupon used from the
different coupon origins. This can be used to
see
where most of the coupons that were used
originated from to help improve our
advertising scheme.


Results


The queries that were made based on the data given yielded a variety of results that could be
used to identify trends and make assumptions for the co
mpany. The results were categorized
into different time sets to further distinguish the sales since it wasn’t clear at first.



The top five items that were sold over the period were snack, eggs, butter, cook, and cereal. The
bottom five items that were
sold over the period were pizza, soft, nuts, cleansers, and pills.






0
200
400
600
800
1000
1200
1400
1600
1800
Item Sales

Items

Total Items Sold
-

Winter

0
500
1000
1500
2000
Total Sales

Items

Total Items Sold
-
Summer



Out of the total items that were sold in the winter the top five items were butter, cereal, cook,
eggs, and snack. The bottom five items that were sold during the winter were
cleansers, nuts,
pills, pizza, and soft












Out of the total items sold this summer the top 5 items were butter, cereal, snack,
cook, and eggs.
The bottom five

items were cleansers, nuts, pills, pizza, and soft











We also broke down the
top five items in terms days. It is clear that the snack item was most
popular during the weekend and that butter was consistently popular during the entire week.












Along with the top five items of the day we also organized it into the bottom 5 i
tems by day.
From the graphs we can tell that the soft item was consistently popular from Sunday
-
Saturday
.













0
200
400
600
800
1000
1200
1400
Items Sold

Day

Top 5 Items By Day

Butter
Cereal
Cook
Eggs
Snack
0
50
100
150
200
250
Items Sold

Day

Bottom 5 Items By Day

Cleansers
Pizza
Soft
Nuts
Pills
We felt that it would be helpful to also organize all of the items by the amount of sales that they
accrued
weekly
.





0
50
100
150
1
13
25
37
49
61
73
85
Item Sales

Week

Butter
-

Sales by Week

Butter
0
20
40
60
80
100
1
13
25
37
49
61
73
85
Item Sales

Week

Cereal
-

Sales By Week

Cereal
0
2
4
6
8
10
12
1
13
25
37
49
61
73
85
Item Sales

Week

Cleansers
-

Sales By Week

Cleansers
0
50
100
1
13
25
37
49
61
73
85
Item Sales

Week

Cook
-

Sales By Week

Cook
0
2
4
6
1
11
21
31
41
51
61
Item Sales

Week

Pills
-

Sales By
Week

Pills
0
50
100
150
1
13
25
37
49
61
73
85
Item Sales

Week

Eggs
-

Sales By Week

Eggs
















0
10
20
30
40
1
13
25
37
49
61
73
85
Item Sales

Week

Pizza
-

Sales By
Week

Pizza
0
5
10
15
20
25
30
1
13
25
37
49
61
73
85
Item Sales

Week

Soft
-

Sales By Week

Soft
0
2
4
6
8
10
12
14
1
13
25
37
49
61
73
85
Item Sales

Week

Nuts
-

Sales By Week

Nuts
0
20
40
60
80
100
120
1
13
25
37
49
61
73
85
Item Sales

Week

Snack
-

Sales By Week

Snack
The coupons were a pivotal part of the information that was provided. Based on the information
the Sunday Supplement Vendor and the Newspaper Ad Store were the most used coupons.


To make it easier we graphed two items that showed more of stab
ility in sales throughout the
seasons. Cleansers an eggs were items that showed strong sales all year round.












0
200
400
600
800
1000
1200
1400
Total Coupon Value (Dollars)

Coupon Origin

Total Coupon Usage











We also graphed two items that showed a trend of selling more during a particular season. Pills
seemed to sell more during

the winter, while snacks seemed to sell more during the summer.

























Analytics

0
50
100
150
200
250
300
350
0
50
100
150
200
250
Total Amount of Snacks Purchased

Total Amount of Eggs Purchased

Clusters of Snack and Eggs Purchased

Cluster Centroids
Cluster A
Cluster B
Cluster C
Cluster D
Cluster E
Cluster F
0
50
100
150
200
250
300
350
0
50
100
150
200
250
Total Amount of Snacks Purchased

Total Amount of Eggs Purchased

Clusters of Snack and Eggs Purchased

Cluster Centroids
Cluster A
Cluster B
Cluster C
Cluster D
Cluster E
Cluster F
The graph above displays the different clustering between the purchasing habits of customers
based on the items Snack and Eggs. The clusters are much more obvious and uniform as
compared to those of the unusable comparison between pills and snacks. This
grouping of the
customer buying habits can then allow for demographic comparison in order to make proper
recommendations to Dr. Yoo.












0
25
50
75
100
125
150
175
200
225
250
275
300
325
Eggs
Snack
Eggs
Snack
Eggs
Snack
Eggs
Snack
Eggs
Snack
Eggs
Snack
Cluster A
Cluster B
Cluster C
Cluster D
Cluster E
Cluster F
Total Amount of Items Bought

Items and Clusters

Boxplots for Eggs and Snack
Organized by Cluster

The chart on the previous page displays box and whisker plots for both eggs and snack organized
by the corresponding cluster. Eggs and snack were chosen because they were both in the top 5
of the most items sold overall. It was also determined that eggs
and snack are purchased by all
customer and cluster and statistical analysis will provide a better snapshot of the entire customer
base. Clusters A, C, E and F all have similar item amounts purchased and the demographics of
these customers will be compare
d in order to gain a trend. Clusters B and D had different
purchase amounts between the two compared items and their demographics will also be analyzed
to determine trends between these customer bases.



The graph above displays box and whisker plots

for Snack and Pill based off the total amount of
items purchased by customers in newly determined clusters. Snack and Pill were chosen because
snack is the most purchased item by all customers and Pill is the least purchased item. The idea
was to use cu
stomer demographics from the most purchased items and pair them with the
customer demographic of the least purchased items. The clusters attempt to group customers
based on the amount of items purchased and display any relationships in the purchasing habi
ts.
For example, a group of customers might buy a large amount of snacks and a large amount of
pills. It could be concluded that customers that buy/eat a lot of snacks are more likely to get sick
0
25
50
75
100
125
150
175
200
225
250
275
300
325
Snack
Pill
Snack
Pill
Snack
Pill
Snack
Pill
Snack
Pill
Snack
Pill
Cluster G
Cluster H
Cluster I
Cluster J
Cluster K
Cluster L
Total Amount of Items Bought

Items and Corresponding Cluster

Boxplots for Snack and Pill Organized by
Cluster

and need to purchase pills. Recommendations could be made

based on coupon offers and layout
placement of these items for optimal product sales. Unfortunately the purchasing numbers for
pills are too low and caused improper clustering. As can be seen on the box and whisker plots
for every cluster the median pur
chasing amount for pills was zero. This caused the customers to
be grouped solely on the amount of snacks they purchased providing no usable relationship
between customers purchasing the most popular and least popular item. When viewing the
Clusters of P
ills and Snack purchased above, it can be seen that most of the clusters are grouped
based off of the x
-
axis (or snack purchased). Most of the data points are grouped in the bottom
left corner of the chart with data points from five different clusters all

in the same are. This
proves the data is unusable for out particular analysis. Although this data is unusable it is
important to mention to fully rule out the purchasing relationship of the most and least purchased
items.



0
1
2
3
4
5
6
7
0
50
100
150
200
250
300
350
Total amoutn of Pills Purchased

Total amount of Snack Purchased

Clusters of Pills and Snack Sold

Cluster G
Cluster H
Cluster I
Cluster J
Cluster K
Cluster L
Recommendations


Based o
n the queries, data clustering, and graphs that were produced as a result of studying data,
there were several recommendations that we could make that could possibly increase the stores’
profit.

One of the most obvious recommendations that we determined wa
s that certain items sold more
during particular seasons. For example, the pills had their highest sales during the winter, while
snacks were sold more during the summer. Pills would sell more during the winter because
during this time more people become

ill with sickness and it is also the height of flu season
during the winter
. Snacks would sell more during the summer because during this time of year
people have parties and more children are out of school. It would be wise for the store to stock
up on

these two items during the seasons in which they sell the most items. These items should
be placed at the end of isles, near cashiers, or wherever customers would be more inclined to buy
these items.

Recommendations could also be made on the coupons that

were used in transactions within the
store. The Sunday Supplement Vendor Coupon and the Newspaper Ad Store coupon were the
coupons that were utilized the most by customers. They were the coupons that brought in the top
sales in cents. Basing our advert
isements on these coupons would be a profitable way to
promote the store and increase sales. If customers see that there are coupons that they could use
that would save them the most money on items that they want, they would be more inclined to
visit the
store and also motivated to spend more money.

Many items that we observed in the data displayed constant sales over the course of the entire
period. For example, Cleansers and Eggs were items that were in constant demand by the
customers throughout each s
eason. Since these are items that customers are always going to
want it would be smart to make sure that the store always has a ready supply of these specific
items. It would also be profitable to have sales on these items since it would spark customers
to
buy more of the items, which would in turn lead to more profit.

Focusing on the clustering of the customer groups, there is a variety of purchasing habits
displayed throughout the clusters. This is consistent with a healthy business as you have
customers that come in everyday and consistently purchase the same items (
or a consistent large
variety of items) as well as customers that come into the store and only buy one item for the
entire year. The clustering provided for organization of these customers, which will allows for
analysis of the demographics in order to se
e trends in the purchasing habits. To begin, Cluster F
displayed customers with an even buying pattern of eggs and snack. These two items rank in the
top 5 of the most purchased items which allows for reasoning as to why they are purchased so
much. Clus
ter F is the ideal customer as they purchase a relatively even amount of both of the
items and visit the store at a consistent rate to purchase these items. This is ideal because the
customer is much more predictable and therefore stocking and vendor purc
hasing can be better
scheduled.

The first demographic attribute analyzed in Cluster F was the household ownership of cable TV.
Of the 18 customers in the cluster only 7 have cable TV. This result of basically a 50
-
50
relationship proves that cable TV d
oes not sway a customer to a consistent buying pattern similar
to that of Cluster F. Open Mart commercials would not be something economically smart in
order to push customers to this more consistent buying pattern. Another demographic attribute
analyzed

is whether or not the customer has children in their household. Of the 18 customers in
Cluster F only 1 did not have any children which displays a strong similarity between the
consistent buying customers of Cluster F and having children. There are many

ways to target
children in a household via marketing just in the store. Because this specific cluster analysis
focuses on Eggs and Snacks and it is known that customers with children are more likely to
consistently buy both, the layout in the store of th
ese two items could be varied to increases sales
to these customers. Knowing the customer will most likely buy both items in a visit to Open
-
Mart, these items could be strategically placed far away from each other. This would require
these customers to t
ravel from one side of the store to the other in order to obtain both items and
exposing them to several more aisles and advertises on their way. Another layout
recommendation would be to place the child’s toy section or school supplies on the way from
sn
acks to eggs to encourage the customer to purchase toys and/or school supplies for their child.
The customers in Cluster F also display a 50
-
50 relationship between whether they have a
newspaper subscription or not, which proves newspaper ads would not en
courage customers to
purchase more consistently or visit the store more. The Open
-
Mart name is strong enough that to
attract this type of customer base there is a small need for advertisement. Another important
demographic attribute for this type of cust
omer is their income. The range for this particular
cluster was anywhere from an income below $10,000 per year and over $75,000 per year.
Because there is no relationship in the income it’s hard to make good recommendation to
improve the purchasing patte
rn of these customers. With this inconsistent income level the best
recommendation is to increase the amount of coupons handed out and mailed to these customers.
Coupon usage was not very consistent and very minimal, and the usage of a coupon will
encour
age customers to buy more of a given item and even to buy an item they might not
normally buy. All customers want to save money and the introduction of more coupons will be
economically beneficial for a solid future customer base.

Another consistent clu
ster was cluster C, which showed consistent purchasing amounts for each
of the items compared, but displayed a smaller overall amount of items purchased. At first
glance the average of the family size of these customers was 2.65, which translate to betwee
n 2
and 3 people that purchase eggs and snack at a consistent yet low rate. It is obvious that these
customers require less of these products so therefore will purchase much less. They have a
similar 50
-
50 relationship between having and not having both
cable TV and a newspaper
subscription. To improve the amount of items purchased advertising should therefore be used in
both newspaper ads and on television, but should not bombard the customer in order to get them
to the store. The customer base is alre
ady strong so therefore an expensive campaign similar to
that of Geico would prove to be economically pointless with a large cost and a likely small
increase in customer base, versus a discrete campaign using newspapers and short TV ads.

The clustering of
the purchasing amounts of eggs and snack between the customer base proved
successful in determining strategic placement of these items in the store as well as offering
coupons and ads for the store. Just this clustering has an endless amount of informatio
n and in
the allotted time only a small amount of information was gathered compared to the potential of
clustering. More analysis could include different clusters of items and more analysis of those
specific customers.

Customer Demographic Analysis

Table.
: Monroeville Open*Mart
-

Average Customer

Our Average
Customer

Mean

Mode

Family Size

2.9

2

Income

34751

40000

Ethnicity

N/A

White

Dogs

0.4

None

Cats

0.2

None

TVs

2.05

2

Newspaper
Subscriber

53% NO

53% NO

Male(Age)

61.1 Yr

65+

Female(Age)

59.3 Yr

65+

Children

98.8% Have
Kids

68% Are
18+

Cable TV

2+ Sets

57.1% Have

Male Occupation

Professional/Ma
nager

30 %
Retired

Female Occupation

Professional/
Clerical

27%
Retired



Based on the data received from Mr. David, the Customer Relationship Manager, average
demographic attributes were found and are displayed in the table above. This information can
be very beneficial to the store for a wide variety of things such as market
ing schemes, logistics,
and target customers. A further, more specific, breakdown of all customer attributes can be
found in the Appendix.

Bulk Sales

Out of the 494 customers in our demographic database, an overwhelming amount can be broken
down into a fe
w different groups or clusters. For example, almost 75% of all of our customers
have a family size of 2 or 3 and only 17% of our customers come from families of 5 or more
people. This is especially useful to utilize selling more low quantity products as
opposed to
selling items in bulk.

Offering items in bulk may not always be the wrong choice. Our number 1 selling product,
Snacks, virtually double in sales during the summertime. This is more than likely attributed to
our customers having picnics and p
arties. Monroeville Open*Mart should offer bulk packages of
snacks during the summertime. This could save a lot of costs due to shipping, handling, and
storage. Customer appreciation would also be expected to increase. If bulk sales show to be
profitab
le, it may be a possibility to sell Snacks in bulk year round which could result in even
more savings.

Cat Food

Even though only 16% of our customer base owns a cat, cat food sales are still relatively high.
This means that majority of the 79 customers wh
o buy cat food from Open*Mart are repeat loyal
customers. With the amount of business that they bring to the store in cat food sales alone, we
should offer coupons or specials for the cat food to show customer appreciation. As these
customers continue to

buy cat food from us and enjoy the discounted price, they are likely to buy
or continue to buy their other products from our store.

Overall Marketing

In order to entice our customers to come shop at Open*Mart, we need to tell them about our
unbeatable pri
ces. Even though Newspaper coupons and marketing have proven very effective,
there may be a better option. Out of our 494 customers, a whopping 97% own at least 1 TV.
Considering that about 30% of our customers are retired, 98% have children, and 125 wo
men are
unemployed (possibly stay
-
at
-
home moms), our market is prime for TV advertising. Though if
TV advertising is considered, it should be on basic cable that everyone receives as only 53% of
our customers have an extended cable package. Since the gro
up that will be watching TV the
most is relatively specific, we can also determine what types of products to advertise. For
instance, we could market a lower selling item such as soap to our customers during daytime TV.
Stay
-
at
-
home moms and Retirees gen
erally buy the soap for their household and this should
entice them to buy from us.

Insights into the Future

This analysis has shown how valuable information can be, but just having information is not
enough to generate profitable results. The real
benefits are realized when data is manipulated to
show underlying trends, clusters, and groups. Knowing what items different customers tend to
buy, what time of the day, week, or year they’re sold, and other ambiguous trends are an
essential tool in maint
aining profitability and continuing the enterprise’s growth.

Future insights for Open*Mart and data manipulation in general are very widespread. The sky
really is the limit as to what can be done with enough data. Open*Mart should continue to
collect dat
a about their customers and transactions. Having the latest information about their
sales is crucial to good business. If Open*Mart has budget for an Industrial Engineer, or some
type of analyst or consultant, benefits would be great. A manager may not
have the necessary
skill set needed to do the analysis. Just from this data, it is easy to see that someone hired to
discover data trends would more than pay for their salary.


References

1)

The ER diagram that we created during the study was used frequently

as a reference.



2)

Kumara, Soundar. “
Relational Data Model”. The Pennsylvania State University,
Pennsylvania. IE330 Fall 2011.Lecture

3)

Kumara, Soundar. “Entity
-
Relationship Data Model”. The Pennsylvania State
University, Pennsylvania. IE330 Fall 2011.Le
cture

4)

Kumara, Soundar. “Data Model”. The Pennsylvania State University, Pennsylvania.
IE330 Fall 2011.Lecture

5)

Kumara, Soundar. “Database Systems”. The Pennsylvania State University,
Pennsylvania. IE330 Fall 2011.Lecture

Appendix

These are the Customer
Demographics we used to make recommendations.