Data Science And Statistical Modelling In Space And Time
Assignment
This assignment consists of three sections. Section A consists of spatial modelling questions, Section B consists
of time series modelling questions, and Section C is a project containing a report.
You should submit a PDF containing your answers to Section A, Section B, Section C (Questions 1-6) and
your report for Section C Question 7.
For Sections A, B and C (Questions 1-6), commented R code (and the resulting output and plots) should be part of your answers.
The report for Question C7 should be no more than 6 pages (where figures/tables/code can be in an appendix
that does not contribute to the page limit) and have the following sections:
• Introduction – explaining the rationale behind the analysis, e.g. what questions are you aiming to
answer.
• Initial Data Analysis – provide the reader with graphical and numerical summaries that highlight any
spatial and temporal patterns.
• Methods – describe the methods that you are going to use for your main analysis, highlighting why
your choices are appropriate.
• Results – present the results of your analyses together with a clear narrative that will help the reader
understand the results.
• Summary – summarise the key findings of your analysis: what were the answers to your questions?
• Bibliography – listing any papers and/or online sources that you reference in your report.
• Appendix – R Code.
The deadline for submission is 12 noon, 9th August. Submission is via ELE.
A. Spatial modelling [100 marks]
You have just started work at an oceanographic consultancy. You are asked to interpolate a set of sea surface
temperature data for one month in the Kuroshio off Japan onto a grid with a resolution of 0.5° in both the E
and N directions. We are going to assume a flat Earth!
The data are in the file kuroshio.csv on ELE. An R program to read the data (readkuro.R) is also on ELE.
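As a minimal sketch of the target grid described above, a regular 0.5° grid can be built in base R with expand.grid. The bounding box used here is hypothetical (it is not taken from kuroshio.csv) and should be replaced by the range of the real coordinates; the column names lon and lat are likewise assumptions.

```r
# Hypothetical bounding box in degrees; replace with the range of the real data.
lon_range <- c(140, 150)
lat_range <- c(30, 40)
res <- 0.5                      # grid resolution in degrees, as required

# Every combination of grid longitudes and latitudes (flat-Earth assumption).
grid <- expand.grid(
  lon = seq(lon_range[1], lon_range[2], by = res),
  lat = seq(lat_range[1], lat_range[2], by = res)
)

nrow(grid)   # number of prediction locations
```

With this made-up 10° by 10° box the grid has 21 points in each direction.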
1. Produce numerical and graphical summaries of the data. Comment on your findings and highlight any
potential outliers in the data. [10 marks]
2. Check for isotropy (the function variog4 in geoR may be useful). Do you need a trend in the model?
[20 marks]
3. Decide what spatial model you want to fit. You may want to try several and see which one fits best.
Estimate the parameters of your chosen model by Maximum Likelihood and plot the expected value
and variance for the estimate on the required grid. Validate your model. [35 marks]
4. Repeat 3 but use Bayesian methods. [25 marks]
5. Comment on the difference between the two methods of estimation. [10 marks]
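For Question 2, the idea behind a directional check for isotropy can be sketched in base R. This is not geoR's variog4, and its angle convention and binning may differ from that function; it is only an illustration of computing empirical semivariances separately for four direction classes. The coordinates and temperatures below are simulated placeholders, and the names lon, lat and z are assumptions.

```r
set.seed(1)                                 # arbitrary seed, for reproducibility only
n   <- 150
lon <- runif(n, 0, 10)                      # made-up coordinates, not the real data
lat <- runif(n, 0, 10)
z   <- 20 + 0.5 * lat + rnorm(n)            # made-up values with a north-south trend

# All unordered pairs of points, their separation and semivariance.
pairs <- t(combn(n, 2))
dx    <- lon[pairs[, 2]] - lon[pairs[, 1]]
dy    <- lat[pairs[, 2]] - lat[pairs[, 1]]
h     <- sqrt(dx^2 + dy^2)
semi  <- 0.5 * (z[pairs[, 2]] - z[pairs[, 1]])^2

# Direction of each pair in [0, 180) degrees, measured from the E axis,
# assigned to the nearest of 0, 45, 90 and 135 degrees.
ang       <- (atan2(dy, dx) * 180 / pi) %% 180
dir_class <- (round(ang / 45) %% 4) * 45

# Mean semivariance per distance bin and direction class.
dist_bin <- cut(h, breaks = seq(0, 5, by = 1))
vg <- tapply(semi, list(distance = dist_bin, direction = dir_class), mean)

# A first-order trend surface, as one way to judge whether a trend is needed.
trend_fit <- lm(z ~ lon + lat)
```

If the columns of vg differ markedly between directions, that may point to anisotropy or to an unmodelled trend; repeating the calculation on residuals(trend_fit) helps to separate the two.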
Note: fitting Gaussian processes becomes significantly more expensive as the number of data points increases.
You may want to consider fitting models to a subset of the data for computational efficiency (consider how
you might want to split the data, and how you might use the left-out data).
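One possible way to split the data, sketched under stated assumptions: a simple random subset is used for fitting and the remainder is kept for validation. The number of observations and the subset size are hypothetical, and rmse is an illustrative helper rather than a function from any package. Other splits (for example, spatially stratified ones) may be more appropriate.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n_obs <- 1000                     # hypothetical number of rows in the data
n_fit <- 300                      # hypothetical size of the fitting subset

fit_idx  <- sample(n_obs, size = n_fit)          # rows used to fit the model
held_out <- setdiff(seq_len(n_obs), fit_idx)     # rows kept back for validation

# Root mean squared error between held-out observations and model predictions.
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}
```

The left-out rows can then be predicted from the fitted model and compared with their observed values, for example with rmse.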
B. Time series modelling [100 marks]
1. The figures labelled A to E show five time series whose defining equations are given below.
i) $X_t = 0.8X_{t-1} + \epsilon_t$,
ii) $X_t - 3X_{t-1} + 3X_{t-2} - X_{t-3} = \epsilon_t - 0.8\epsilon_{t-1}$,
iii) $X_t = -0.8X_{t-1} + \epsilon_t$,
iv) $X_t = 0.4X_{t-1} + 0.6X_{t-2} + \epsilon_t$,
v) $X_t = \epsilon_t + 0.05\,t\,\epsilon_{t-1}$.
State, with reasons, which equation corresponds to which plot. [10 marks]
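To get a feel for how such series behave, equations i) to iv) can be simulated in base R. This is a sketch under assumptions: the noise term is taken to be Gaussian white noise with unit variance, the series length and starting values (zero) are arbitrary, and none of this reproduces the figures A to E.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n   <- 200                       # series length (an assumption)
eps <- rnorm(n)                  # assumed Gaussian white noise, unit variance

# (i) and (iii): AR(1) recursions X_t = phi * X_{t-1} + eps_t, started at zero.
x1 <- stats::filter(eps, filter = 0.8,  method = "recursive")
x3 <- stats::filter(eps, filter = -0.8, method = "recursive")

# (iv): AR(2) recursion with coefficients 0.4 and 0.6.
x4 <- stats::filter(eps, filter = c(0.4, 0.6), method = "recursive")

# (ii): the left-hand side is (1 - B)^3 X_t, so integrate an MA(1) three times.
ma <- eps - 0.8 * c(0, eps[-n])               # eps_0 is taken to be 0
x2 <- diffinv(ma, differences = 3)[-(1:3)]    # drop the three zero starting values

# Roots of the AR polynomial 1 - 0.4 z - 0.6 z^2 of (iv).
ar_roots <- polyroot(c(1, -0.4, -0.6))
Mod(ar_roots)
```

Plotting x1 to x4 with plot.ts and inspecting their sample autocorrelations with acf shows how positive and negative autoregressive coefficients, differencing, and roots on the unit circle each leave a distinct visual signature.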
The data for this assignment are the measured strength of the overturning in the North Atlantic from
moorings at 26°N between April 2004 and March 2014, found in the file overturning.csv.
a. Average the data to quarterly means. Produce numerical and graphical summaries of the averaged data,
and comment on your findings and highlight any potential outliers. You might find it useful to convert
the averaged data to a time series object ts(). [10 marks]
b. Fit an ARMA and an ARIMA model to the data. Choose the most appropriate model, and use this to
predict the values for the six 3-month periods from April 2014 to September 2015. [30 marks]
c. Fit a DLM to the data (including both a trend and a seasonal component). Use your model to predict
the values for April 2014 to September 2015. [30 marks]
d. Compare the results of parts b and c, and comment on any differences you may find. [10 marks]
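The workflow in parts a to d can be sketched end to end in base R. Everything here is an illustration under assumptions: the daily values are simulated placeholders (the layout of overturning.csv is not assumed), the model orders are arbitrary candidates rather than recommended choices, and the state-space model is fitted with StructTS from the stats package (a basic structural model with trend and seasonal components) in place of a dedicated DLM package such as dlm.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Synthetic stand-in for the real file: one made-up value per day.
dates <- seq(as.Date("2004-04-01"), as.Date("2014-03-31"), by = "day")
day   <- seq_along(dates)
strength <- 17 - 0.0005 * day + 2 * sin(2 * pi * day / 365.25) +
  rnorm(length(dates), sd = 3)

# (a) Average within calendar quarters and store as a quarterly ts object.
yr   <- as.integer(format(dates, "%Y"))
qtr  <- (as.integer(format(dates, "%m")) - 1) %/% 3 + 1
key  <- yr * 4 + (qtr - 1)                      # increases by 1 each quarter
q_ts <- ts(as.numeric(tapply(strength, key, mean)),
           start = c(2004, 2), frequency = 4)   # 2004 Q2 = April to June 2004

# (b) One ARMA and one ARIMA candidate, fitted by maximum likelihood.
fit_arma  <- arima(q_ts, order = c(1, 0, 1), method = "ML")
fit_arima <- arima(q_ts, order = c(0, 1, 1), method = "ML")
fc_arima  <- predict(fit_arima, n.ahead = 6)    # 2014 Q2 to 2015 Q3

# (c) Basic structural model: local linear trend plus quarterly seasonal.
fit_bsm <- StructTS(q_ts, type = "BSM")
fc_bsm  <- predict(fit_bsm, n.ahead = 6)

# (d) Point forecasts and standard errors side by side.
comparison <- cbind(arima = fc_arima$pred, arima_se = fc_arima$se,
                    bsm = fc_bsm$pred, bsm_se = fc_bsm$se)
comparison
```

The ARIMA forecast is taken from fit_arima purely for illustration. Note that AIC values from arima are not directly comparable between models with different orders of differencing, so the choice between fit_arma and fit_arima should also rest on residual diagnostics and on the plots from part a.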