Vinho Verde
Milestone 1 project
UNSW
COMM5000
T3 2024
Vinho Verde project summary
In recent years, the increasing interest in wine has driven the expansion of the wine
industry. Consequently, companies are investing in new technologies to enhance both wine
production and sales. Quality certification is crucial in these processes and currently
depends heavily on wine tasting by human experts.
You are consulting for a winery, helping them predict or estimate human wine taste
preferences during the certification step. Knowing the wine quality will enable the winery to
better predict available quantities and annual sales. It will also support the oenologist’s
wine tasting evaluations by potentially improving the quality and speed of their decisions,
thereby enhancing wine production. Additionally, similar techniques can aid in target
marketing by modeling consumer tastes from niche markets. To predict wine quality, you
will use a dataset consisting of 4,898 white and 1,599 red vinho verde samples from
Portugal’s northwest region, along with the statistical methods covered in this course.
IN THIS PROJECT (M1, M2, and final report) YOU WILL:
As a consultant/data analyst, perform a preliminary exploratory data analysis of the
dataset. You are expected to uncover properties, characteristics,
patterns, and statistics hidden in the dataset. This is what you will do in
Milestone 1. The results from M1 may be used for the analysis in M2 and for
the final goal of predicting wine quality.
Synthesise your insights from M1 and M2 and construct a model that can predict
wine quality, using only the variables provided in the dataset. More on this in
the final report.
Based on your analysis, deliver your findings to the winery you are consulting.
THE DATA
Dataset file: This assessment requires you to download the Excel file
provided on your course Moodle page (file name: “Vinho_Verde.xlsm”). The
dataset covers the red and white variants of the Portuguese “Vinho Verde”
wine.
For more details, consult: or the
reference [Cortez et al., 2009] which can be accessed from
.
WHAT IS NEXT?
• Note: BEFORE opening the file, right-click the Excel file and select
Properties, then the General tab. Under Security, ensure Unblock is checked and click
Apply.
• Depending on your computer and software setup, this step may be required
because Office blocks macro code by default. This Excel dataset file
requires macros to be enabled to function correctly and to allow for data
randomisation. For consistent file use, it is advised that you:
1. Download the file “Vinho_Verde.xlsm” from our Moodle site
2. Unblock the macro via Properties – General (see the note above)
3. Open and save the downloaded dataset to a known location, so you can
continue to work from the same file
   a. When you save the file, keep the same XLSM extension
4. Only use the file you opened and saved at step 3.
Do not download your Excel dataset file multiple times from Moodle, as this will
result in you working with different numbers.
• Save the spreadsheets to your own device where you will
use Excel
• Note: This is individual work, worth 15%. You must
submit your OWN work by the due date
• Cheating and any other academic integrity violation are
not tolerated. We enforce a strict zero-tolerance policy,
and any breach could result in severe consequences.
M1 Activities
• If you do not upload your report in time, you will not be able to provide
feedback on the M1 reports, and you will not be able to participate in
the Feedback on feedback activity.
• You will be grouped into teams of 4 or 5
• The teams will be used to set up peer marking of the individual M1
submissions (feedback on feedback: 5%)
• It is important to leave high-quality, constructive feedback so that your
effort can be rightfully rewarded
• Your feedback will be assessed and you will also have the opportunity to
evaluate the feedback you have received.
• Timelines:
• Week 4, Friday, deadline for submission of M1
• Week 6, deadline for marking the submissions you were assigned to
• Week 7, deadline for evaluating the feedback you have received
• 15% for M1 report:
• Important to submit on time
• Time allocation for ELS students
• 5% for the quality of feedback provided (feedback on feedback)
• An opportunity to learn from each other, and provide honest,
constructive feedback
• Computed as average of all feedback on feedback received
• Teaching team can intervene to address any issue
Milestone 1 Tasks
– For each variable included in the dataset:
– compute central tendency and dispersion statistics
– Do this for the entire sample, red, and white separately
– See example below
– Use charts for graphical summary/presentation
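The summary-statistics task above can be sketched in code for readers who want to cross-check their Excel results. This is only an illustrative sketch using the Python standard library: the rows are hypothetical toy values, not taken from the course dataset, and the tuple layout (wine type, alcohol, quality) and the function names are assumptions made for the example.

```python
import statistics

# Hypothetical toy rows (NOT from Vinho_Verde.xlsm):
# (wine type, alcohol %, quality score)
rows = [
    ("red", 9.4, 5),
    ("red", 10.2, 6),
    ("red", 11.0, 6),
    ("white", 8.8, 5),
    ("white", 10.1, 6),
    ("white", 12.4, 7),
]

def summarise(values):
    """Central tendency and dispersion statistics for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }

def summarise_by_group(rows, column):
    """Summarise one column for the entire sample, then for each wine type."""
    groups = {"all": [r[column] for r in rows]}
    for r in rows:
        groups.setdefault(r[0], []).append(r[column])
    return {name: summarise(vals) for name, vals in groups.items()}

# Column index 1 is alcohol in the assumed tuple layout.
alcohol_summary = summarise_by_group(rows, column=1)
for name, stats in alcohol_summary.items():
    print(name, stats)
```

The same function can be called once per variable, giving one table for the entire sample, one for red, and one for white.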
– Correlation analysis: an analysis of the correlation between
wine quality and alcohol content
– This analysis can be done with a description of the
relationship between wine quality and alcohol content
using the entire dataset (i.e. using both red and white)
– This can be then repeated for red and then for white
– Charts should be used to illustrate your findings
– For M1 this is just a qualitative analysis
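Although M1 asks only for a qualitative description supported by charts, a Pearson correlation coefficient can help confirm the direction you see in a scatter chart. The sketch below is a hedged illustration using the Python standard library: the rows are hypothetical toy values, not the course data, and the `pearson` helper and tuple layout are assumptions made for the example.

```python
import math

# Hypothetical toy rows (NOT from Vinho_Verde.xlsm):
# (wine type, alcohol %, quality score)
rows = [
    ("red", 9.4, 5),
    ("red", 10.2, 6),
    ("red", 11.0, 6),
    ("white", 8.8, 5),
    ("white", 10.1, 6),
    ("white", 12.4, 7),
]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def alcohol_quality_corr(rows, wine_type=None):
    """Correlation of alcohol with quality, optionally for one wine type."""
    subset = [r for r in rows if wine_type is None or r[0] == wine_type]
    return pearson([r[1] for r in subset], [r[2] for r in subset])

r_all = alcohol_quality_corr(rows)             # red and white together
r_red = alcohol_quality_corr(rows, "red")      # red only
r_white = alcohol_quality_corr(rows, "white")  # white only
print(r_all, r_red, r_white)
```

A positive value suggests quality tends to rise with alcohol content, a negative value the opposite; the scatter chart remains the main evidence for the qualitative discussion.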
Questions?
1) Check discussion forums
2) Post on discussion forums
3) Email your lecturer or tutor