Linear regression with a fixed intercept on log-transformed data

I have a set of values for a surface (in pixels) that becomes bigger over time (exponentially). The surface consists of cells that divide over time. After doing some modelling, I came up with the following formula:

$$S(t)=S_{initial}2^{t/a_d},$$

where $a_d$ is the age at which the cell divides. $S_{initial}$ is known. I am trying to estimate $a_d$. I simply tried the $\chi^2$ test:

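import numpy as np
import matplotlib.pyplot as plt

# cell_area and time_range are NumPy arrays holding the measured areas and time points.
# Range of ages of division.
a_range = np.linspace(1, 500, 100)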

# Set up an empty vector to store the chi squared value
chi_sq = np.zeros(len(a_range))

# Iteration through division ages 
for i in range(len(a_range)):
    # Compute the expected value at each time point. 
    expect = cell_area[0] * (2**(time_range/a_range[i]))

    # Compute chi squared 
    chi_sq[i] = np.sum((cell_area - expect)**2)

# Plot chi squared test
plt.plot(a_range, chi_sq, '.')
plt.yscale('log')

# Labelling
plt.xlabel('division age [min]')
_ = plt.ylabel(r'$\chi^2$')

but the minimum is always at the upper bound of the age range, whatever I set this range to be. That doesn't seem right. So I linearized the model:

$$\ln S(t)=\ln S_{initial}+\frac{\ln 2}{a_d}\,t,$$

which is now just simple linear regression with a fixed intercept.

Questions:

  1. Why didn't the first method work?
  2. Are there any resources on how to implement the above regression in Python? I'm new to this and everything I found was very simple, but I don't know how to deal with logs and fix the intercept.

Topic chi-square-test structural-equation-modelling linear-algebra python

Category Data Science


You can use sklearn to perform this fit (sklearn.linear_model.LinearRegression)

-> Set fit_intercept=False and use X=t and Y=ln(S(t))-ln(S(0)). The slope (a in Y=aX) should then be equal to ln(2)/a_d, so a_d = ln(2)/a.
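
A minimal sketch of that fit, assuming synthetic noise-free data in place of the asker's `cell_area` and `time_range` arrays. The values `s0 = 50.0` and `true_ad = 90.0` are made up for illustration and are not from the question:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-ins for the measured data: noise-free exponential growth.
s0 = 50.0        # assumed known initial surface [pixels]
true_ad = 90.0   # assumed division age [min]
time_range = np.linspace(0.0, 300.0, 31)
cell_area = s0 * 2.0 ** (time_range / true_ad)

# Subtract the known intercept so the fitted line must pass through the origin.
X = time_range.reshape(-1, 1)            # sklearn expects a 2-D feature array
Y = np.log(cell_area) - np.log(s0)

model = LinearRegression(fit_intercept=False).fit(X, Y)
slope = model.coef_[0]                   # estimate of ln(2) / a_d
a_d = np.log(2) / slope                  # recover the division age
print(a_d)
```

With real measurements, replace the synthetic arrays with the measured ones and use the known initial surface for `s0`.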

About the analytical result: have you taken into account that if cells are dividing on a surface, some cells might no longer be able to divide as it gets more and more crowded?


A chi-squared test does not serve any purpose here. Nonlinear functions can be handled by transforming them into linear functions; once the nonlinear data or relation has been transformed into linear form, a linear model can be used. The chi-squared test checks for variability. You seem to be interested in the total surface (area), i.e. a linear model, and not a linear regression.
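
To make the transformation concrete: once the known intercept is subtracted, the log-transformed model is a line through the origin, and its least-squares slope has the closed form sum(t*y)/sum(t*t). The sketch below uses only NumPy and made-up data; the names `s0`, `true_ad`, `t` and `area`, and the 5% multiplicative noise, are assumptions for illustration, not the asker's measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: exponential growth with multiplicative (log-normal) noise.
s0 = 50.0        # assumed known initial surface [pixels]
true_ad = 90.0   # assumed division age [min]
t = np.linspace(0.0, 300.0, 31)
area = s0 * 2.0 ** (t / true_ad) * np.exp(rng.normal(0.0, 0.05, t.size))

# Log-transform and subtract the fixed intercept ln(s0).
y = np.log(area / s0)

# Least-squares slope for a line through the origin: y = slope * t.
slope = np.sum(t * y) / np.sum(t * t)

# slope estimates ln(2) / a_d, so invert to get the division age.
a_d = np.log(2) / slope
print(a_d)
```

Fitting in log space weights relative errors equally, which suits multiplicative noise; if the noise on the areas is additive instead, the two approaches can give noticeably different estimates.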
