# Data Science And Statistical Modelling In Space And Time

Assignment

This assignment consists of three sections. Section A consists of spatial modelling questions, Section B consists of time series modelling questions, and Section C is a project containing a report.

You should submit a PDF containing your answers to Section A, Section B, Section C (Questions 1-6) and your report for Section C Question 7.

For Sections A, B and C (Questions 1-6), commented R code (together with its output and plots) should be part of your answers.

The report for Question C7 should be no more than 6 pages (figures, tables and code can go in an appendix that does not count towards the page limit) and should have the following sections:

• Introduction – explain the rationale behind the analysis, e.g. what questions you are aiming to answer.

• Initial Data Analysis – provide the reader with graphical and numerical summaries that highlight any spatial and temporal patterns.

• Methods – describe the methods that you are going to use for your main analysis, highlighting why your choices are appropriate.

• Results – present the results of your analyses together with a clear narrative that will help the reader understand the results.

• Summary – summarise the key findings of your analysis: what were the answers to your questions?

• Bibliography – list any papers and/or online sources that you reference in your report.

• Appendix – R code.

The deadline for submission is 12 noon, 9th August. Submission is via ELE.

A. Spatial modelling [100 marks]

You have just started work at an oceanographic consultancy. You are asked to interpolate a set of sea surface temperature data for one month in the Kuroshio off Japan onto a grid with a resolution of 0.5° in both the E and N directions. We are going to assume a flat Earth!
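
As a minimal sketch of what a 0.5° prediction grid might look like in R, the block below builds one with `expand.grid`. The coordinates and the names `lon`, `lat` and `pred_grid` are hypothetical stand-ins; the real extent should come from the data in kuroshio.csv.

```r
# Hypothetical coordinates standing in for the real observation locations
lon <- c(131.2, 134.7, 140.1, 145.9)
lat <- c(30.4, 33.8, 36.2, 39.6)

res <- 0.5  # grid spacing in degrees, as requested in the brief

# Snap the data extent outwards to the nearest multiples of the resolution
lon_seq <- seq(floor(min(lon) / res) * res, ceiling(max(lon) / res) * res, by = res)
lat_seq <- seq(floor(min(lat) / res) * res, ceiling(max(lat) / res) * res, by = res)

# One row per grid node; treating degrees as planar units (the flat-Earth assumption)
pred_grid <- expand.grid(lon = lon_seq, lat = lat_seq)
dim(pred_grid)
```

Snapping the limits outwards keeps every observation inside the grid; whether the grid should extend beyond the data at all is a modelling choice.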

The data are in the file kuroshio.csv on ELE. An R program to read the data (readkuro.R) is also on ELE.

1. Produce numerical and graphical summaries of the data. Comment on your findings and highlight any potential outliers in the data. [10 marks]

2. Check for isotropy (the function variog4 in geoR may be useful). Do you need a trend in the model? [20 marks]
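
One possible way to use `variog4` is sketched below on simulated data, since the column layout of kuroshio.csv is not given here. The data frame `sim`, its column names and the choice `max.dist = 8` are assumptions for illustration only; the sketch assumes the geoR package is installed.

```r
library(geoR)  # assumes geoR is installed

set.seed(1)
n <- 200
# Simulated stand-in data: a north-south gradient plus independent noise
sim <- data.frame(lon = runif(n, 130, 146), lat = runif(n, 30, 40))
sim$sst <- 25 - 0.6 * (sim$lat - 30) + rnorm(n, sd = 0.5)

gd <- as.geodata(sim, coords.col = 1:2, data.col = 3)

# Directional variograms at 0, 45, 90 and 135 degrees (the variog4 defaults),
# first with a constant mean and then after removing a first-order trend
v4_const <- variog4(gd, max.dist = 8)
v4_trend <- variog4(gd, trend = "1st", max.dist = 8)

par(mfrow = c(1, 2))
plot(v4_const)
plot(v4_trend)
```

If the four directional curves differ markedly under a constant mean but agree once a trend is removed, that suggests the apparent anisotropy is driven by the trend rather than by the covariance structure.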

3. Decide what spatial model you want to fit. You may want to try several and see which one fits best. Estimate the parameters of your chosen model by maximum likelihood and plot the expected value and variance for the estimate on the required grid. Validate your model. [35 marks]
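
The workflow in this question could be sketched with geoR roughly as follows. This is an illustration on a simulated Gaussian random field in the unit square, not on the Kuroshio data: the covariance model, starting values and grid are all assumptions, and the unit-square grid merely stands in for the 0.5° grid.

```r
library(geoR)  # assumes geoR is installed

set.seed(2)
# Simulated field with an exponential covariance (partial sill 1, range 0.25, nugget 0.1)
gd <- grf(100, cov.pars = c(1, 0.25), cov.model = "exponential", nugget = 0.1)

# Maximum likelihood fit; ini.cov.pars gives starting values (partial sill, range)
fit <- likfit(gd, ini.cov.pars = c(1, 0.2), cov.model = "exponential", nugget = 0.1)
summary(fit)

# Regular prediction grid over the unit square
grid <- expand.grid(x = seq(0, 1, by = 0.05), y = seq(0, 1, by = 0.05))

# Kriging with the fitted parameters plugged in
kc <- krige.conv(gd, locations = grid, krige = krige.control(obj.model = fit))

par(mfrow = c(1, 2))
image(kc, locations = grid, main = "Kriging mean")
image(kc, locations = grid, values = kc$krige.var, main = "Kriging variance")
```

Competing covariance models can be fitted the same way by changing `cov.model` and compared through their maximised log-likelihoods; validation still needs a separate step, for example prediction at held-out locations.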

4. Repeat 3 but use Bayesian methods. [25 marks]

5. Comment on the difference between the two methods of estimation. [10 marks]

Note: fitting Gaussian processes becomes significantly more expensive as the number of data points increases. You may want to consider fitting models to a subset of the data for computational efficiency (consider how you might want to split the data, and how you might use the left-out data).
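
A simple random split is one way to do this, sketched below in base R. The data are simulated, the subset size of 300 is arbitrary, and the training mean is only a placeholder for predictions from a fitted spatial model.

```r
set.seed(3)
n <- 1000  # hypothetical number of observations
sim <- data.frame(lon = runif(n, 130, 146),
                  lat = runif(n, 30, 40),
                  sst = rnorm(n, mean = 20, sd = 3))

# Fit on a random subset, keep the remainder for validation
train_idx <- sample(n, size = 300)
train <- sim[train_idx, ]
test  <- sim[-train_idx, ]

# Placeholder prediction: in practice this would be the kriging
# prediction at the held-out locations
pred <- rep(mean(train$sst), nrow(test))

# Root mean squared error on the left-out data
rmse <- sqrt(mean((test$sst - pred)^2))
rmse
```

A random split keeps the spatial coverage of the training set similar to that of the full data; a split by region would instead test how well the model extrapolates.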

B. Time series modelling [100 marks]

1. The figures labelled A to E show five time series whose defining equations are given below.

i) $X_t = 0.8X_{t-1} + \epsilon_t$,

ii) $X_t - 3X_{t-1} + 3X_{t-2} - X_{t-3} = \epsilon_t - 0.8\epsilon_{t-1}$,

iii) $X_t = -0.8X_{t-1} + \epsilon_t$,

iv) $X_t = 0.4X_{t-1} + 0.6X_{t-2} + \epsilon_t$,

v) $X_t = \epsilon_t + 0.05\,t\,\epsilon_{t-1}$.

State, with reasons, which equation corresponds to which plot. [10 marks]
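
Simulating from a candidate equation is one way to build intuition for how its sample path should look. The sketch below does this for two generic first-order autoregressions with coefficients of opposite sign, assuming Gaussian white noise innovations and a series length of 200 (neither is specified above).

```r
set.seed(4)
n <- 200  # assumed series length

# AR(1) processes with positive and negative coefficients,
# driven by standard Gaussian white noise (an assumption)
x_pos <- arima.sim(model = list(ar = 0.8), n = n)
x_neg <- arima.sim(model = list(ar = -0.8), n = n)

par(mfrow = c(2, 2))
plot(x_pos, main = "AR(1), coefficient 0.8")
plot(x_neg, main = "AR(1), coefficient -0.8")
acf(x_pos, main = "ACF, coefficient 0.8")
acf(x_neg, main = "ACF, coefficient -0.8")
```

Note that `arima.sim` only accepts stationary autoregressive parts, so equations with unit roots need either its `order` argument for differencing or a hand-written recursion.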

The data for this assignment are the measured strength of the overturning in the North Atlantic from moorings at 26°N between April 2004 and March 2014, found in the file overturning.csv.

a. Average the data to quarterly means. Produce numerical and graphical summaries of the averaged data, comment on your findings and highlight any potential outliers. You might find it useful to convert the averaged data to a time series object with ts(). [10 marks]
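
A minimal sketch of the averaging step in base R is given below. It uses a simulated daily series because the layout of overturning.csv is not described here; the column names `date` and `value`, the daily sampling and the numbers themselves are all assumptions.

```r
set.seed(5)
# Simulated daily series covering April 2004 to March 2014
dates <- seq(as.Date("2004-04-01"), as.Date("2014-03-31"), by = "day")
daily <- data.frame(date = dates, value = 17 + rnorm(length(dates), sd = 4))

# Calendar year and quarter (1 = Jan-Mar, ..., 4 = Oct-Dec) for each day
daily$yr  <- as.integer(format(daily$date, "%Y"))
daily$qtr <- (as.integer(format(daily$date, "%m")) - 1) %/% 3 + 1

# Quarterly means, sorted into time order
q_means <- aggregate(value ~ yr + qtr, data = daily, FUN = mean)
q_means <- q_means[order(q_means$yr, q_means$qtr), ]

# April 2004 falls in the second quarter of 2004
q_ts <- ts(q_means$value, start = c(2004, 2), frequency = 4)
summary(q_ts)
plot(q_ts, ylab = "Quarterly mean")
```

With ten years of data this gives 40 quarterly values, running from the second quarter of 2004 to the first quarter of 2014.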

b. Fit an ARMA and an ARIMA model to the data. Choose the most appropriate model, and use this to predict the values for the six 3-month periods from April 2014 to September 2015. [30 marks]
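
The mechanics of fitting and forecasting might look like the sketch below, which uses `arima` and `predict` from the stats package on a simulated quarterly series. The model orders are arbitrary illustrations, not recommendations for the overturning data.

```r
set.seed(6)
# Simulated quarterly series with the same time span as the averaged data
y <- ts(17 + as.numeric(arima.sim(model = list(ar = 0.5), n = 40)),
        start = c(2004, 2), frequency = 4)

fit_arma  <- arima(y, order = c(1, 0, 0))  # ARMA(1,0) with a mean term
fit_arima <- arima(y, order = c(0, 1, 1))  # ARIMA(0,1,1), no mean term

# Residual checks for one of the fits
acf(residuals(fit_arma))
Box.test(residuals(fit_arma), lag = 8, type = "Ljung-Box")

# Six quarters ahead: 2014 Q2 to 2015 Q3
fc <- predict(fit_arma, n.ahead = 6)
fc$pred
fc$se
```

Information criteria are not directly comparable between fits with different orders of differencing, because the likelihoods are computed on different data, so residual diagnostics or hold-out forecasts are a safer basis for choosing between the two.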

c. Fit a DLM to the data (including both a trend and a seasonal component). Use your model to predict the values for April 2014 to September 2015. [30 marks]
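
One way to set up such a model is with the dlm package, which is an assumption here since the brief does not name a package. The sketch below combines a local linear trend with a quarterly seasonal component on simulated data; the variance parameterisation in `build` and the starting values are illustrative choices.

```r
library(dlm)  # assumes the dlm package is installed

set.seed(7)
tt <- 1:40
# Simulated quarterly series: linear trend + fixed seasonal pattern + noise
y <- ts(17 - 0.03 * tt + rep(c(1, -0.5, -1, 0.5), 10) + rnorm(40, sd = 0.8),
        start = c(2004, 2), frequency = 4)

# Local linear trend plus seasonal component; variances on the log scale
build <- function(par) {
  dlmModPoly(order = 2, dV = exp(par[1]), dW = exp(par[2:3])) +
    dlmModSeas(frequency = 4, dV = 0, dW = c(exp(par[4]), 0, 0))
}

mle  <- dlmMLE(y, parm = rep(0, 4), build = build)  # estimate the variances
mod  <- build(mle$par)
filt <- dlmFilter(y, mod)

# Six quarters ahead, with forecast standard deviations
fc <- dlmForecast(filt, nAhead = 6)
fc$f
sqrt(unlist(fc$Q))
```

It is worth checking `mle$convergence` (0 indicates success) before using the fitted variances, since the likelihood surface for variance parameters can be flat with only 40 observations.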

d. Compare the results of parts b and c, and comment on any differences you may find. [10 marks]
