Curve Fitting using MATLAB

Project aims at finding the best fit curve for a given data and finding the error between the original and said curve.

The data being used is that of Specific heat (Cp) vs Temeperature. Plot will be made to show variation of Cp with Temp first. 

Code:

% Data
cpdata = load('data');
temp = cpdata(:,1);
Cp = cpdata(:,2);

% Original Data Points Plot
plot(temp, Cp, 'linewidth', 3, 'b')

% Curve fitting

% Cp = aT + b LINEAR

coeff1 = polyfit(temp, Cp, 1);
predicted_cp1 = polyval(coeff1, temp);

% Cp = aT^2 + bT + c QUADRATIC

coeff2 = polyfit(temp, Cp, 2);
predicted_cp2 = polyval(coeff2, temp);

% Cp = aT^3 + bT^2 + cT + d CUBIC

coeff3 = polyfit(temp, Cp, 3);
predicted_cp3 = polyval(coeff3, temp);

% Compare my curvefit with original data
hold on
plot(temp, predicted_cp1, 'linewidth', 2, 'r')
plot(temp, predicted_cp2, 'linewidth', 2, 'k')
plot(temp, predicted_cp3, 'linewidth', 2, 'm')

xlabel('Temperature (K)')
ylabel('Specific Heat (kJ/kmol-K)')
%legend('Original Curve','Curve Fit quadratic','Cubic','Quartic')
legend('Original Curve', 'Curve Fit linear', 'Curve Fit quadratic', 'Curve Fit cubic')
title('Specific Heat vs Temperature (Curve Fitting)')

% Errors Calculation begins here

% Mean for SSR
meanavg = mean(Cp);

sse = [0, 0, 0]; %Sum of squared errors
ssr = [0, 0, 0]; %Sum of squared regression
for i = 1:length(cpdata)
  sserror1(i) = Cp(i) - predicted_cp1(i); % sum of suard errors = Difference between actual data points and fitting function
  sserror2(i) = Cp(i) - predicted_cp2(i);
  sserror3(i) = Cp(i) - predicted_cp3(i);
  sse = [(sse(1) + (sserror1(i))^2), (sse(2) + (sserror2(i))^2), (sse(3) + (sserror3(i))^2)];
  
  ssregress1(i) = predicted_cp1(i) - meanavg; % sum of suared regression = difference between points of predicted function and mean
  ssregress2(i) = predicted_cp2(i) - meanavg;
  ssregress3(i) = predicted_cp3(i) - meanavg;
  ssr = [(ssr(1) + (ssregress1(i))^2), (ssr(2) + (ssregress2(i))^2), (ssr(3) + (ssregress3(i))^2)];
endfor

sse
ssr
sst = ssr + sse

rsquare = ssr./sst % measure of successful fitting

% Adjusted r square
rsqadj = 1 - (((1-rsquare)*(3200-1))/(3200 - 1 - 1))

%Root mean square errors = square root of sse divided by nuumber of data points available
rmse = (sse/3200).^0.5 %Previously had not squared, was getting 168 RMSEss

Best Fit: Refers to the curve that fits the data and has the least error with respect to the original curve.

Perfect Fit: Perfect fit is the fit with zero error, or as we will calculate later, an R^2 value equal to 1. A perfect fit isn't possible to achieve as there will always be approximations, and some unrealistic points as mentioned in the linked article: Curve Fitting; best fit and perfect fit. The higher the degree of the polynomial, the higher the R^2 value but greater is the approximation, which is often times shown as a warning in the command window. Basically, the polynomial is sensitive to change.

The goal here is to check goodness of the fit and assess the error.

  1. First the given Cp vs temp data is stoed in a 3200x2 array.
  2. Then plot is done for Cp vs Temperatore from the given points, as "original curve".
  3. Then, polyfit is used to derive the coefficients of the polynomial which is the used to calculate the predicted Cp value using polyval. This is done for linear, quadratic and cubic fits.
  4. The resepctive predicted Cp values are then plotted against Temp on the same graph.

Error calculation:

  1. First, we take 2 row vectors with 3 values. One stored as sse (sum of squared errors) which is the square of the difference between original data and fit curve, and other as ssr (sum of squared regression) which is difference of fit curve from the mean of original data.
  2. 3 values in the row vector are to store errors for linear, quadratic and cubic vectors.
  3. In a for loop from 1 to length of the cp data (Which is 3200), we calculate sse and ssr in every iteration and store it in the respective arrays.
  4. sse, ssr and sst, which is the sum of sse and ssr, are returned.
  5. R^2 is the measure of goodness of the fit, which is calculated as "ssr/sst". This ranges from 0 to 1, with 1 indicating perfect fit.
  6. Additionally, Adjusted R^2 and RMSE (root mean square error) are calculated, which do the adjustment to account for independent variables.

Output:

sse =

   2163049.08918    540107.56288     94272.02052

ssr =

   26630624.16146   28253565.68776   28699401.23012

sst =

   28793673.25064   28793673.25064   28793673.25064

rsquare =

   0.92488   0.98124   0.99673

rsqadj =

   0.92485   0.98124   0.99672

rmse =

   25.9991   12.9917    5.4277

R^2 values for various fits:

  1. Linear: 0.92488
  2. Quadratic: 0.98124
  3. Cubic: 0.99673

Ideally, increasing the degree of the polynomial increases the goodness of the fit and gets closer and closer to being the best fit. This is demonstrated above, where cubic shows highest R^2 value amongst the 3, and we know that R^2 value is the means to judge goodness of the fit.

However, we also know that we get warnings when we increase the degree to above 2. This is because of the approximations that the solever makes to fit the curve. 

We can atempt to solve the approximations using the following 2 methods, by my research (still on-going):

  1. Using structures to store the coefficients rather than as an array. This allows us to reference the memory block while calculating the predicted Cp value, and can "supposedly" reduce approximations. However, I haven't been able to get this working yet.
  2. Work around/add new dependent variables to make the curve more precise. This would mean playing around with the adjusted R^2 value. This too is a work in progress.

Plot:

Final Plot

Errors:

First attempt at using SSE as an array:

SSE array mistake

Added dot operator for array operations. Problem Solved.

Polyfit array experiment: Attempting to workout structures by first storing polyfit coefficients in separate variables and calling them

polyfit experiment

--Work in progress; for the moment, elimiated it and stuck to standard syntax--

Other References:

  1. Polynomials
  2. Buckingham Pi Theorem

Projects by Arjun Bhat

This project explores the simulation of the motion of a pendulum for the equation of damped vibration of a pendulum. Below is the equation (taken from "Challenges" page): The above equation is solved using ODE function, and then odeint is used to get an ar Read more

Project involves plotting an Otto Cycle graph in MATLAB The goal is to plot all strokes for a petrol operating by the otto cycle from intake, compression, power to exhaust, including both adiabatic curves. Description: The Otto Cycle is the idealised cycle that descr Read more

Objective: The project involves simulating a 2-joint robotic arm.   Code and Description: import math import matplotlib.pyplot as plt import numpy as nm # Lengths l1 = 1 l2 = 0.5 # Angles ang1 = nm.radians(nm.linspace(0, 90, 19)) ang2 = nm.radians Read more

Flow over bicycle in Python
Arjun Bhat · 2019-11-18 16:09:30

Objective: To calculate the drag force for various drag coefficient and velocity values for a certain geometry.   Introduction: Drag force is the force exerted by the air as the vehicle moves through it. The force exerted is directly proportional to the square Read more

Aim: To parse the NASA Thermodynamc Properties data file, calculate thermodynamic properties and plot the characteristics. Introduction: NASA's thermodynamic properties file consists of multiple species of gases, their respective operating temperature ranges, and Read more

The project aims at optimization of a stalagmite function and finding the global maxima. For this, we will be using the concept of genetic algorithm. Following are the steps used to optimise the function: Code the function given in a separate function code. This is Read more

This project explores the simulation of the motion of a pendulum for the equation of damped vibration of a pendulum. Below is the equation (taken from "Challenges" page): An attempt is made to solve the above equation, and then call it using ode45 to get an array of Read more

Project involves plotting an Otto Cycle graph in MATLAB The goal is to plot all strokes for a petrol operating by the otto cycle from intake, compression, power to exhaust, including both adiabatic curves. Description: The Otto Cycle is the idealised cycle that descr Read more

Code and Description: The project involves simulating a 2-joint robotic arm. % Lengths l1 = 1; l2 = 0.5; % Angles ang1 = linspace (0, 90, 20); ang2 = linspace (0, 90, 20); count = 1; % loop for i = 1:length(ang1) ANG1 = ang1(i); for j = 1:length(ang2 Read more


Loading...

The End