ON THE MAXIMIZATION PROBLEM FOR SOLUTIONS OF REACTION–DIFFUSION EQUATIONS WITH RESPECT TO THEIR INITIAL DATA

Abstract. We consider in this paper the maximization problem for the quantity ∫_Ω u(t, x)dx with respect to u_0 := u(0, ·), where u is the solution of a given reaction-diffusion equation. This problem is motivated by biological conservation questions. We show the existence of a maximizer and derive optimality conditions through an adjoint problem. We have to face regularity issues, since non-smooth initial data could give a better result than smooth ones. We then derive an algorithm enabling us to approximate the maximizer, and discuss some open problems.


Statement of the problem and earlier works
We investigate in this paper the following optimization problem: given T > 0, we want to maximize the functional J_T(u_0) := ∫_Ω u(T, x)dx among all possible initial data u_0 ∈ A_m, where

$$A_m := \Big\{ u_0 \in L^\infty(\Omega) \;:\; 0 \le u_0 \le 1, \ \int_\Omega u_0(x)\,dx = m \Big\},$$

and u = u(t, x) is the solution of the reaction-diffusion equation

$$\partial_t u = \sigma \Delta u + f(u) \ \text{ in } (0,T)\times\Omega, \qquad \frac{\partial u}{\partial \nu}(t, x) = 0 \ \text{ for all } t \in (0, T), \text{ for all } x \in \partial\Omega. \tag{1.2}$$

Here u represents the density of a population, so ∫_Ω u(t, x)dx is the total population at time t. Given an initial total population m, we thus want to place it in such a way that the total population at time T is maximized.
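To fix ideas, the following minimal sketch (in Python; the paper's own simulations, described in Section 6, use a MATLAB routine) discretizes (1.2) with an explicit Euler scheme in time and centered differences in space, and evaluates J_T. The bistable nonlinearity with Allee threshold ρ = 0.25 and the single-block datum are illustrative assumptions echoing Figure 1, not the paper's exact parameters.

```python
import numpy as np

def laplacian(u, dx):
    """Centered second difference with homogeneous Neumann BC (reflected ghost points)."""
    lap = np.empty_like(u)
    lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
    lap[0] = 2.0 * (u[1] - u[0]) / dx**2
    lap[-1] = 2.0 * (u[-2] - u[-1]) / dx**2
    return lap

def solve_forward(u0, f, sigma, T, dx, dt):
    """Explicit Euler scheme for u_t = sigma * u_xx + f(u) on (0, T)."""
    u = u0.copy()
    for _ in range(round(T / dt)):
        u = u + dt * (sigma * laplacian(u, dx) + f(u))
    return u

def JT(u0, f, sigma, T, dx, dt):
    """The objective J_T(u0): total population at time T."""
    return solve_forward(u0, f, sigma, T, dx, dt).sum() * dx

rho = 0.25                                  # illustrative Allee threshold
f = lambda u: u * (1.0 - u) * (u - rho)     # bistable nonlinearity

x = np.arange(-50.0, 50.0, 0.1)             # the grid used in Section 6
dx, sigma = 0.1, 1.0
dt = dx**2 / (3.0 * sigma)                  # stability bound quoted in Section 6

u0 = (np.abs(x) < 1.11).astype(float)       # single block of mass ~2.22 (Figure 1, left)
print(JT(u0, f, sigma, 20.0, dx, dt))
```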
Figure 1. The graphs show the influence of the initial datum. On the left-hand side, the initial population u(0, x) is concentrated in a single block of mass m = 2.22 (solid blue line). Also shown is the solution u(T, x) of (1.2) after different time lapses: T = 20 (dashed red line) and T = 30 (dotted yellow line); clearly the population tends to disappear as time goes by. On the right-hand side, we consider an initial population with the same mass as before but distributed into two slightly separated blocks; the resulting density after the same time periods is clearly bigger, and in this case the population tends to establish.

This is a very natural problem but, as far as we know, it has never been addressed. Let us just mention three papers that investigate similar questions.
In [7], the case of a particular family of initial data u_0^α, consisting of two blocks of total length L separated by a gap parametrized by α ≥ 0, has been investigated for a bistable non-linearity f(u) = u(1 − u)(u − ρ), at infinite horizon T = +∞. In that case, when Ω = R, for any given α ≥ 0 it is known from [17] that there exists a critical mass L*(α) > 0 such that for any L < L*(α) the solution goes to 0, and for L > L*(α) it converges to 1. The authors of [7] provided numerics showing that one can get L*(α) < L*(0) for α small. This means that for a given initial total population L ∈ (L*(α), L*(0)), the initial datum u_0^α associated with two blocks separated by a small gap will converge to 1, while the initial datum u_0^0 associated with a single block will converge to 0 (Fig. 1). This example shows that our present optimization problem could be difficult, since fragmented initial data could give a better total population at time T ≫ 1. Hence, we expect regularity issues for a possible maximizer.
In [6] a problem similar to the present one is investigated, with a more complex cost, but for a concave non-linearity f, which will later appear to be quite restrictive in our case (see Sect. 4.1), and with a global control at every time t ∈ (0, T). First order optimality conditions are heuristically derived, but the authors do not investigate them further in order to determine the optimal control. Lastly, in [15], the authors consider a bistable non-linearity f(t, u) = u(1 − u)(u − ρ(t)), and the control is ρ, which is assumed to belong to [0, 1] for all t ≥ 0. The authors prove that with such a control one can get arbitrarily close to a target function (a travelling wave) in sufficiently large time.
Let us also mention [2], where a similar model is investigated. In this paper, a particular bistable non-linearity is considered, and the authors optimize the L² distance to 1 at time T for several releases at various times. They prove the existence of an optimizer, compute the first order derivative, and then consider a toy model (with f ≡ 0) and the particular case where u_0 lies in the class of sums of Gaussian-type functions, for which they optimize numerically over the centers of the Gaussian functions.
The main contributions of this paper are the following. First, we show that there exists a maximizer for the functional J_T. Second, we establish some optimality conditions for this maximizer arising from the study of the adjoint state. This allows us to provide a numerical algorithm to approximate this optimal distribution in practice.
Before getting into the statement of our results, let us briefly comment on the biological motivations of this work.

Biological motivation
Dengue fever, also known as breakbone fever and dandy fever, is caused by the dengue virus, which is carried and transmitted by mosquitoes of the genus Aedes, the two most prominent species being A. aegypti and A. albopictus. Nowadays the spread of this virus is increasing, and so is the interest in finding a way to control it in the absence of an effective medical treatment.
Manipulating the arthropod population by introducing a maternally inherited bacterium called Wolbachia has been catching the attention of biologists in recent years [3, 10, 11, 16]. In the infected mosquito this bacterium prevents the development of the virus, but it also induces a cytoplasmic incompatibility which lowers the reproduction rate when the infected population is small, yet becomes unimportant once its density is sufficiently large [4].
Reaction-diffusion equations have been widely used to describe biological phenomena of spreading and species competition. Thanks to the works of [4], the dynamics between the infected and non-infected mosquito populations can be described by a Lotka-Volterra system, and it has been rigorously shown that it may be studied by means of a single reaction-diffusion equation on the proportion u : R_+ × Ω → [0, 1] of infected individuals with respect to the total population.
In such models, the reaction term f(u) reflects the positive correlation between population density and individual fitness, known as the Allee effect. In the current problem this effect is caused by the cytoplasmic incompatibility: there exists a critical density threshold ρ below which the population of infected mosquitoes declines, while it increases for larger densities. In fact, we have f(u) < 0 if 0 < u < ρ and f(u) > 0 if ρ < u < 1. Hence there is a bistable mechanism: either the infected population disappears (i.e. u → 0 as t → ∞, also called extinction), or the whole population gets infected after a sufficiently large lapse of time (i.e. u → 1 as t → ∞, also called invasion).
Of important and practical interest is the study of sufficient conditions on the different parameters of the problem in order to reach the invasion state once the deliberately infected mosquito population is released in the environment. Different approaches to this problem have been proposed in the recent literature, from the biological point of view [8] and also from the mathematical one. In [15], already mentioned above, the proposed strategy is to modify the Allee threshold ρ in order to reach an a priori given target trajectory; in practice this is possible by means of manipulating different biological factors which directly affect the mosquito population, such as increasing or decreasing natural predator populations or modifying the carrying capacity of the environment. A similar problem is studied in [5], where it is proved that there exists a systematic way to choose a time T > 0, a bounded domain Ω and a distributed control law g(u) supported in Ω such that, for any initial value u_0, the solution of the associated control problem reaches the invasion state. In practice, the process of infecting mosquitoes manually can be laborious and expensive, so institutions usually have a limited amount of resources, and it would be suitable to know the best way to use them. If we assume that we have a fixed mass of infected mosquitoes to be released into the environment, it is crucial to find out how to distribute them in order to maximize the effect of this infected founding population after some time T; see for example the works [1, 2].

Problem formulation and main result
We will consider in this paper a bounded, smooth, connected domain Ω, and we make standard assumptions (H1), (H2) and (H3) on the reaction term f (in particular, f is of class C² with f(0) = f(1) = 0; these are the properties used in the proofs below). Under these assumptions, problem (1.2) has a unique solution u(t, x), and it is such that 0 ≤ u(t, x) ≤ 1, so we can define the operator J_T : A ⊂ L¹(Ω) → R by

$$J_T(u_0) := \int_\Omega u(T, x)\,dx, \tag{2.1}$$

where u is the solution of equation (1.2). We can now formulate our main result.

Theorem 2.1. Let Ω be a bounded domain and let f satisfy hypotheses (H1), (H2) and (H3). Then there exists ū_0 ∈ A_m such that J_T(ū_0) = max_{u_0 ∈ A_m} J_T(u_0). Moreover, denoting by ū the solution of (1.2) associated with this optimal initial datum and by p the unique solution of (2.3), there exists a non-negative real value, denoted by c, such that
i) if 0 < ū_0(x) < 1 then p(0, x) = c,
ii) if ū_0(x) = 0 then p(0, x) ≤ c,
iii) if ū_0(x) = 1 then p(0, x) ≥ c.
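The adjoint problem (2.3) is not displayed in this section of the text; a plausible reconstruction, consistent with the linearization carried out in Lemma 3.2 below, with the equation −φ_t − σΔφ = f″(u)u_t p + f′(u)φ satisfied by φ = p_t in the proof of Lemma 3.3, and with the terminal condition p(T, x) = F′(u(T, x)) stated in Section 7 (here F(u) = u), is the backward problem

$$-\partial_t p = \sigma \Delta p + f'(u)\,p \ \text{ in } (0,T)\times\Omega, \qquad \frac{\partial p}{\partial \nu}(t,x) = 0 \ \text{ for } t\in(0,T),\ x\in\partial\Omega, \qquad p(T,\cdot) = 1 \ \text{ in } \Omega, \tag{2.3}$$

where u is the solution of (1.2) with the initial datum under consideration.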
The existence of such a maximizer ū_0(x) corresponds to the best possible way of distributing a fixed initial mass m in a bounded domain Ω in order to maximize the total mass at t = T. However, the issue of uniqueness is still an open problem.
The second part of Theorem 2.1 gives us some useful information regarding the profile of an optimal initial datum; in fact it implies that any optimum can be written as ū_0 = 1_{{p_0 > c}} + γ 1_{{p_0 = c}}, with 0 ≤ γ(x) ≤ 1 and p_0(x) := p(0, x). In particular, if the adjoint state p_0(x) is not constant on any subset of Ω of positive measure, then the optimum is ū_0(x) = 1_{{p_0(x) > c}}. In Section 5 we will see that this result allows us to define a numerical algorithm to approximate a local maximum of J_T.

Proof of Theorem 2.1
We first state some results concerning the regularity of u, J and the adjoint state p that we will later invoke.
Lemma 3.1. Under hypotheses (H1), (H2) and (H3) stated above on f, the solution u = u(t, x) of (1.2) satisfies the following estimates: 0 ≤ u ≤ 1; u ∈ L^∞(0, T; L²(Ω)) ∩ L²(0, T; H¹(Ω)); and ∂_t u ∈ L²(0, T; (H¹(Ω))′).

Proof. The first assertion is a straightforward consequence of the maximum principle and the properties of f. In fact, since 0 ≤ u_0(x) ≤ 1 and f(0) = f(1) = 0, we have that U ≡ 1 is a super-solution and U ≡ 0 is a sub-solution, so we get the result.
In order to prove the other two estimates, let us multiply equation (1.2) by u and integrate over Ω; we obtain

$$\frac{1}{2}\frac{d}{dt}\int_\Omega u^2\,dx + \sigma\int_\Omega |\nabla u|^2\,dx = \int_\Omega f(u)\,u\,dx \le C\int_\Omega u^2\,dx. \tag{3.1}$$

By the Grönwall inequality we get ∫_Ω u²(t, x)dx ≤ e^{2Ct} ∫_Ω u_0²(x)dx ≤ e^{2CT}|Ω|. Note that the constant in the right-hand side is independent of t, so we have actually proved that u ∈ L^∞(0, T; L²(Ω)). Integrating (3.1) in time we get a bound on σ∫_0^T∫_Ω |∇u|² dx dt, and so it yields u ∈ L²(0, T; H¹(Ω)). Finally, choosing v ∈ H¹(Ω) such that ‖v‖_{H¹(Ω)} ≤ 1, multiplying (1.2) by v and integrating again, it holds that |⟨∂_t u, v⟩| ≤ σ‖∇u‖_{L²(Ω)} + C‖u‖_{L²(Ω)}, from which we can deduce, thanks to the previous estimates, that ∂_t u ∈ L²(0, T; (H¹(Ω))′).

Lemma 3.2. The operator J_T defined in (2.1) is differentiable. Furthermore, choosing p as the unique solution of (2.3), it holds that ⟨∇J_T(u_0), h_0⟩ = ∫_Ω h_0(x)p(0, x)dx for any increment h_0 ∈ L²(Ω) such that u_0 + εh_0 remains in the admissible set A_m for |ε| small enough.
Moreover, defining h as the solution of the linearized equation

$$\partial_t h = \sigma\Delta h + f'(u)\,h \ \text{ in } (0,T)\times\Omega, \qquad \frac{\partial h}{\partial\nu}(t,x) = 0 \ \text{ for all } t\in(0,T),\ x\in\partial\Omega, \qquad h(0,\cdot) = h_0, \tag{3.8}$$

it also holds that ⟨∇J_T(u_0), h_0⟩ = ∫_Ω h(T, x)dx, where ⟨·, ·⟩ is the scalar product in L²(Ω).
Proof. Let h_0(x) be defined over Ω such that h_0 ∈ L²(Ω) and u_0 + εh_0 is admissible, that is, 0 ≤ u_0 + εh_0 ≤ 1 for any |ε| small enough and ∫_Ω h_0(x)dx = 0. Then there exists h_ε such that the solution v_ε of (1.2) with initial condition v_ε(0, x) = u_0 + εh_0 can be written as v_ε = u + εh_ε, where h_ε is the unique solution of

$$\partial_t h_\varepsilon = \sigma\Delta h_\varepsilon + \frac{f(u + \varepsilon h_\varepsilon) - f(u)}{\varepsilon} \ \text{ in } \Omega, \qquad \frac{\partial h_\varepsilon}{\partial\nu}(t,x) = 0 \ \text{ for all } t\in(0,T),\ x\in\partial\Omega. \tag{3.10}$$

The Gateaux derivative of J_T then writes (J_T(u_0 + εh_0) − J_T(u_0))/ε = ∫_Ω h_ε(T, x)dx. By the Lipschitz continuity of f there exists a positive constant M such that |f(u + εh_ε) − f(u)| ≤ M|εh_ε|; then the Grönwall inequality, applied just as in the proof of the first assertion of Lemma 3.1, implies that (h_ε) is bounded in L^∞(0, T; L²(Ω)), and after a possible further extraction it satisfies h_{ε_n}(T, x) ⇀ h(T, x) weakly in L²(Ω). We can then conclude that the Gateaux derivative of J_T writes ⟨∇J_T(u_0), h_0⟩ = ∫_Ω h(T, x)dx, where h is also the unique solution of the differential equation (3.8), obtained by passing to the limit in the weak formulation of (3.10).
If we show the continuity of the operator u_0 ↦ ∇J_T(u_0), the differentiability of J_T follows. Let h_w, h_v be the solutions to (3.8) for u = w, v, the respective solutions of (1.2) with initial conditions w_0, v_0; it is then easy to check the equation satisfied by h_w − h_v. Multiplying the equation on h_w − h_v by h_w − h_v and integrating over Ω, we get (3.15); the second and third inequalities in (3.15) follow from the regularity of f, the constants C and M being such that |f′| ≤ C and |f″| ≤ M. By the Grönwall lemma, for some real L > 0 it holds that ∫_Ω δ²_{w,v}(t, x)dx ≤ L‖δ_{w_0,v_0}‖_{L²(Ω)}, where δ_{w,v} := w − v. Together with the Cauchy-Schwarz inequality and the fact that h_w and h_v are bounded, this allows us to derive the L² bound (3.17) on h_w − h_v. Combining (3.17) and (3.14) yields the continuity of u_0 ↦ ∇J_T(u_0), and hence the differentiability of J_T. Multiplying (3.8) by the solution p of (2.3) with u = ū and integrating by parts, we can rewrite (3.13) as ⟨∇J_T(u_0), h_0⟩ = ∫_Ω h_0(x)p(0, x)dx, and consequently ∇J_T(u_0) = p(0, ·); this p is often called the adjoint state of h.
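The identity ∇J_T(u_0) = p(0, ·) suggests a direct numerical recipe: one forward solve of the state equation (1.2) followed by one backward solve of the adjoint. The sketch below reuses laplacian from the earlier snippet and assumes the reconstructed form of (2.3) given above; it is illustrative, not the authors' code.

```python
def gradient_JT(u0, f, fprime, sigma, T, dx, dt):
    """Computes grad J_T(u0) = p(0, .) by solving the state equation forward
    in time, storing the trajectory, then marching the adjoint backward from
    the terminal condition p(T, .) = 1."""
    n = round(T / dt)
    traj = [u0.copy()]
    for _ in range(n):                       # forward sweep: state equation
        u = traj[-1]
        traj.append(u + dt * (sigma * laplacian(u, dx) + f(u)))
    p = np.ones_like(u0)                     # p(T, .) = 1 since F(u) = u
    for k in range(n, 0, -1):                # backward sweep: adjoint equation
        p = p + dt * (sigma * laplacian(p, dx) + fprime(traj[k - 1]) * p)
    return p
```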
Let us now find an expression for the second order derivative of J_T. We set v_ε = u + εh + (ε²/2)k_ε and we write the differential equation satisfied by v_ε.
From the regularity hypothesis on f and the estimate (3.17) on h, v_ε and u, it follows that k_ε ∈ L^∞(0, T; L²(Ω)). A passage to the limit as ε → 0 implies, after extraction, the existence of a weakly converging subsequence of (k_ε), with limit k. By means of a Taylor expansion we deduce the differential equation satisfied by k in (0, T) × Ω, together with the boundary condition ∂k/∂ν(t, x) = 0 for all t ∈ (0, T), for all x ∈ ∂Ω. (3.20) An analysis similar to that yielding (3.19) applies: in this case we multiply (3.20) by p and integrate in space and time. Finally, we have proved the convergence as ε → 0, which establishes the formula for the second order derivative of J_T.

Lemma 3.3. Let Ω be a bounded domain. Then the solution p of equation (2.3) is such that p ∈ L^∞((0, T) × Ω) and p ∈ C^{0,α}_loc((0, T), W^{2,q}_loc(Ω)) for all 1 ≤ q < ∞ and α ∈ (0, 1).

Proof. Let us start by proving that p is bounded. In fact, consider p̄ the solution of the ordinary differential equation −p̄′ = M p̄, p̄(T) = 1, with M := max |f′|; then p̄ is a super-solution of (2.3), as a consequence of the maximum principle. Since we know explicitly that p̄ = e^{M(T−t)} ∈ L^∞(0, T), it holds that p ≤ e^{M(T−t)}. Second, we know from the classical L^q regularity theory for parabolic equations (see for instance Theorem 9.1 in Chapter IV of [13]) that, as 0 ≤ u ≤ 1, one has p_t, ∇p, ∇²p ∈ L^q_loc((0, T) × Ω) for all 1 ≤ q < ∞. Similarly, one has u_t, ∇u, ∇²u ∈ L^q_loc((0, T) × Ω) for all 1 ≤ q < ∞. Next, let us define φ := p_t; differentiating (2.3) in time, we obtain that φ is the unique solution of the equation −φ_t − σΔφ = G, where G = f″(u)u_t p + f′(u)φ. Due to the previous estimates, one has G ∈ L^q_loc((0, T) × Ω) for all 1 ≤ q < ∞. Hence, again, the L^q regularity theory for parabolic equations yields φ_t, ∇φ, ∇²φ ∈ L^q_loc((0, T) × Ω) for all 1 ≤ q < ∞. This means in particular that p_t ∈ L^q_loc((0, T), W^{2,q}_loc(Ω)) for all 1 ≤ q < ∞. Taking q large enough, the Morrey inequality thus yields p ∈ C^{0,α}_loc((0, T), W^{2,q}_loc(Ω)) for all 1 ≤ q < ∞ and α ∈ (0, 1). Now, as p(0, ·) ∈ W^{2,1}_loc(Ω), we know (see for example [9]) that for almost every x ∈ {p(0, ·) = c} one has Δp(0, x) = 0. Moreover, as p_tt ∈ L^q_loc((0, T) × Ω) for all 1 ≤ q < ∞, one has p_t ∈ C^{0,α}_loc((0, T), L^q_loc(Ω)) and, in particular, p_t(0, ·) ∈ L^q_loc(Ω). We eventually derive from (2.3) that for almost every x ∈ {p(0, ·) = c} one has −p_t(0, x) = f′(u(0, x))p(0, x).
Now we proceed with the proof of Theorem 2.1.
Proof. This proof falls naturally into two parts: first we establish the existence of a maximal element, and then we characterize it.
Step 1: Existence of a maximal element. The basic idea of this part of the proof is to establish the existence of a supremum of the set {J_T(u_0) : u_0 ∈ A_m}, and then to show that it is attained by some element ū_0 ∈ A_m defined as the limit of a maximizing sequence in A_m.
From the first estimate of Lemma 3.1 it follows that J_T is bounded: indeed, J_T(u_0) ≤ |Ω| for every u_0 ∈ A_m. (3.26) We note also that J_T is a continuous operator, thanks to the results of Lemma 3.2. There must exist, therefore, a supremum of the set of images of J_T, and so a maximizing sequence (u_0^n) in A_m, which means lim_n J_T(u_0^n) = sup_{A_m} J_T(u_0). (3.27) Since 0 ≤ u_0^n(x) ≤ 1, it is clear that u_0^n ∈ L^∞(Ω), and so, after an extraction, we can state that u_0^n ⇀ ū_0 weakly-* in L^∞(Ω) for some ū_0 ∈ L^∞(Ω), i.e. ∫_Ω u_0^n(x)ϕ(x)dx → ∫_Ω ū_0(x)ϕ(x)dx for all ϕ ∈ L¹(Ω). (3.28) Choosing ϕ = 1 in (3.28), we get that ū_0 is still in A_m. Now, to each u_0^n, n = 1, 2, . . ., we can associate the solution u_n of problem (1.2) with initial datum u_0^n; in a weak sense, this means that for all ϕ ∈ C^∞((0, T) × Ω) the following holds:

$$\int_0^T\!\!\int_\Omega \partial_t u_n\,\varphi\,dx\,dt + \sigma\int_0^T\!\!\int_\Omega \nabla u_n\cdot\nabla\varphi\,dx\,dt = \int_0^T\!\!\int_\Omega f(u_n)\,\varphi\,dx\,dt. \tag{3.30}$$

Thanks to the first assertion of Lemma 3.1, we can deduce that u_n(T, x) ∈ L²(Ω) for all n = 1, 2, . . ., and consequently the existence of an element u ∈ L²(Ω) such that, after extraction, u_n(T, x) ⇀ u weakly in L²(Ω). (3.32) Again, from the second assertion of Lemma 3.1, it follows that there exist U ∈ L²(0, T; H¹(Ω)), a subsequence still denoted (∂_t u_n), and v ∈ L²(0, T; (H¹(Ω))′) such that u_n ⇀ U and ∂_t u_n ⇀ v. We can easily prove that ∂_t U = v: in fact, from the weak definition of the partial derivative, the corresponding equality must hold for all test functions, and a simple passage to the limit implies the desired result, and consequently U ∈ H¹(0, T; L²(Ω)). Now, choosing ϕ ∈ H¹(0, T; L²(Ω)) such that ϕ(0, x) = 0 for all x ∈ Ω, after an integration by parts we get an identity for u_n, and so, passing to the limit and integrating by parts again, we obtain the corresponding identity for U. This equality, together with (3.35), implies that u(x) = U(T, x) almost everywhere in Ω. Similarly, choosing ϕ adequately, we prove that ū_0(x) = U(0, x) almost everywhere in Ω.
Finally, let us define the set W := {w ∈ L²(0, T; H¹(Ω)) : ∂_t w ∈ L²(0, T; (H¹(Ω))′)}. The estimates in Lemma 3.1 imply that the sequence (u_n) is bounded in W. Moreover, thanks to the Aubin-Lions lemma [14], W embeds compactly into L²(0, T; L²(Ω)), which ensures the existence of a subsequence, still denoted (u_n), which is Cauchy in L²(0, T; L²(Ω)). Then necessarily u_n → U strongly in L²(0, T; L²(Ω)), and thus f(u_n) → f(U) in L²(0, T; L²(Ω)), (3.37) which follows from the Lipschitz continuity of f and the Cauchy-Schwarz inequality. Now we can pass to the limit in (3.30), gathering (3.28), (3.31) and (3.37), to obtain that for all ϕ ∈ C^∞((0, T) × Ω) it holds

$$\int_0^T\!\!\int_\Omega \partial_t U\,\varphi\,dx\,dt + \sigma\int_0^T\!\!\int_\Omega \nabla U\cdot\nabla\varphi\,dx\,dt = \int_0^T\!\!\int_\Omega f(U)\,\varphi\,dx\,dt,$$

which means that U is a weak solution to the problem ∂_t U = σΔU + f(U) in Ω, with ∂U/∂ν(t, x) = 0 for all t ∈ (0, T), for all x ∈ ∂Ω. (3.39)
Now, from (3.27) and (3.31), choosing ϕ = 1, we get the equalities J_T(ū_0) = ∫_Ω U(T, x)dx = lim_n ∫_Ω u_n(T, x)dx = sup_{A_m} J_T, which in fact means that ū_0 is a maximizing element of J_T in A_m.
Step 2: Characterization of the maximal element. We first prove (i).
Let µ be the Lebesgue measure. We define the set S = {x ∈ Ω : 0 < ū_0(x) < 1} and we suppose that µ(S) > 0; otherwise there exists a set E such that ū_0 = 1_E almost everywhere. We note that S can be written as S = ∪_k S_k, with S_k := {x ∈ Ω : 1/k < ū_0(x) < 1 − 1/k}. Let us fix k sufficiently large that µ(S_k) > 0 and consider two points x*, y* in this set. For ε ∈ R and r ∈ R_+, we define v_0 as the perturbation of ū_0 obtained by adding the constant ε on B(x*, r) ∩ S_k and removing the corresponding mass, spread uniformly, on B(y*, r) ∩ S_k. In particular, we can choose r small enough and x*, y* ∈ S_k such that µ(B(x*, r))/µ(B(x*, r) ∩ S_k) < 2 and µ(B(y*, r))/µ(B(y*, r) ∩ S_k) < 2, and |ε| < 1/(2k) as well; then it is clear that 0 < v_0 < 1, and so v_0 is still admissible. We note also that ∫_Ω v_0(x)dx = m, so v_0 ∈ A_m. We shall now use the fact that ū_0 is a maximizing element in A_m: gathering (3.13) and (3.19), we can expand the inequality J_T(v_0) − J_T(ū_0) ≤ 0 to second order in ε. Here we can multiply the whole expansion by 1/(ε µ(B(x*, r) ∩ S_k)), for both signs of ε; then, letting r go to zero, we get that for almost every x*, y* ∈ S_k it holds that 0 = p(0, x*) − p(0, y*).
We have finally obtained the existence of a constant c ∈ R such that p(0, x) = c almost everywhere in S_k. The same statement holds for every k large enough, so letting k → +∞ we obtain the result for almost every x ∈ S.

Let us now prove (ii).
Let us define the set S⁰ = {x ∈ Ω : ū_0(x) = 0} and, for every k = 1, 2, . . ., the set S⁰_k. We assume that µ(S⁰) > 0; otherwise ū_0 > 0 almost everywhere and we pass directly to (iii). Choosing x* ∈ S⁰_k and y* ∈ S_k defined as above, and r sufficiently small that µ(B(x*, r))/µ(B(x*, r) ∩ S⁰_k) < 2, we build v_0 similarly to the previous case, with ∫_Ω v_0(x)dx = m, so v_0 ∈ A_m. Since ū_0 is a maximizing element in A_m and ε is strictly positive, we get J_T(v_0) ≤ J_T(ū_0); again, we can multiply the resulting inequality by 1/(ε µ(B(x*, r) ∩ S⁰_k)). Passing to the limit as r → 0, we get 0 ≥ p(0, x*) − p(0, y*), whence p(0, x*) ≤ c for almost every x* ∈ S⁰_k and every k large enough. Letting k → +∞ completes the proof of (ii).
Similarly, we can prove (iii). Remark that p ≡ 0 is a sub-solution of equation (2.3), and thus necessarily p ≥ 0. Since c is in the range of p(0, ·), it must be non-negative as well. This ends the proof of Theorem 2.1.

The u 0 -constant case
We will restrict ourselves in this section to the study of the case where the initial mass m is distributed homogeneously over the bounded domain Ω. We thus consider u_0 := m/|Ω| with 0 < m < |Ω|, which is the only constant initial distribution that belongs to A_m. In this case the solution of equation (1.2) is homogeneous in space for every t ∈ [0, T], meaning that u(t, x) = u(t) for all x ∈ Ω. More precisely, u satisfies the ordinary differential equation u′(t) = f(u(t)) with u(0) = m/|Ω|. We also assume that the reaction term f(u) satisfies (H1), (H2), (H3) and the following additional hypothesis:

(H4) there exist ρ ∈ [0, 1] and δ > 0 such that for all x ≥ ρ: f″(x) ≤ −δ.

Proposition 4.1. The constant distribution u_0 ≡ m/|Ω| is a critical point of J_T on A_m. If moreover m > ρ|Ω|, the second order optimality conditions are satisfied at this critical point. Conversely, if u_0 ≡ m/|Ω| is a local maximizer of J_T in the L²-norm, then necessarily f″(m/|Ω|) ≤ 0.

Proof. As seen previously, the derivative of the target operator J_T(u_0) on the admissible set A_m writes ⟨∇J_T(u_0), h_0⟩ = ∫_Ω h_0(x)p_0(x)dx for every zero mean value function h_0 ∈ L²(Ω), with p_0(x) = p(0, x) the adjoint state, which is characterized by (2.3). It is easy to check that p(t, x) = f(u(T))/f(u(t)), which is also homogeneous in space. Consequently, ⟨∇J_T(u_0), h_0⟩ = p(0) ∫_Ω h_0(x)dx = 0, so u_0 is indeed a critical point. Let us now check that, provided the initial mass is large enough, the second order optimality conditions at this critical point are satisfied. We suppose that m > ρ|Ω|; then, since f(u_0) is positive, u(t) stays increasing in time, implying that ρ < u(t) < 1 for every t > 0, and consequently from (H4) we get f″(u(t)) < −δ. Besides, from (4.2) it follows that p(t) ≥ e^{−M(T−t)}, where M is such that f′(u(t)) ≥ −M for all t > 0. As shown in a previous section, h(t, x) satisfies equation (3.8), from which we can deduce a bound on h(t). Gathering those estimates, it holds, for a certain positive constant C depending only on δ, M and T, that ⟨∇²J_T(u_0)h_0, h_0⟩ ≤ −C‖h_0‖²_{L²(Ω)}, which ensures that the second order optimality conditions at this critical point are fulfilled and concludes the proof of the first assertion. Note that in this case the constant c derived from Theorem 2.1 is necessarily c ≡ p_0; otherwise u_0(x) would be either null or totally saturated over the domain Ω, which would imply u_0 ∉ A_m. Hence the set {p_0 = c} coincides with the whole domain Ω.
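The closed form for the adjoint used in the proof above can be verified directly; a one-line check, assuming f(u(t)) does not vanish on [0, T]: since u′(t) = f(u(t)),

$$\frac{d}{dt}\left(\frac{f(u(T))}{f(u(t))}\right) = -\frac{f(u(T))\,f'(u(t))\,u'(t)}{f(u(t))^{2}} = -f'(u(t))\,\frac{f(u(T))}{f(u(t))},$$

so p(t) = f(u(T))/f(u(t)) indeed solves the (spatially homogeneous) adjoint equation −p′ = f′(u)p with terminal value p(T) = 1.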
Let us now show the second part of Proposition 4.1. We suppose that u_0 is a local maximizer in the L²-norm; then for any sufficiently small perturbation h_0(x) the second order optimality condition holds, i.e. ⟨∇²J_T(u_0)h_0, h_0⟩ ≤ 0. (4.7)
In particular, we consider h_0^k(x) = cos(kx), k = 1, 2, . . .; for the sake of simplicity we take Ω = (0, π) and we assume a diffusion coefficient σ = 1. Then we can explicitly calculate the second order derivative of our target operator J_T, which depends on the solution h^k(t, x) of the linearized differential equation (4.8). The solution of (4.8) is explicitly given below; consequently, from (3.9) and thanks to the Laplace method, the asymptotic behavior (4.12) of the second order derivative as k → ∞ follows. Gathering (4.12) and (4.7), we get that necessarily f″(u_0) ≤ 0, which completes the proof.
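For completeness, the explicit solution of (4.8) referenced above can be reconstructed as follows (a sketch, assuming f(u(s)) does not vanish on [0, t]): with σ = 1 and Neumann conditions on (0, π), the ansatz h^k(t, x) = cos(kx)g(t) in the linearized equation gives g′ = (f′(u(t)) − k²)g with g(0) = 1, hence

$$h^k(t,x) = \cos(kx)\,\exp\!\Big(-k^{2}t + \int_0^t f'(u(s))\,ds\Big) = \cos(kx)\,e^{-k^{2}t}\,\frac{f(u(t))}{f(u_0)},$$

where the last equality uses (log f(u))′ = f′(u) along solutions of u′ = f(u).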

The case of a concave non-linearity
In this section we consider concave non-linearities. We have in mind in particular the well-known Fisher-KPP equation, where the reaction term is f(u) = ru(1 − u) and the system is monostable. We remark that this particular f satisfies (H1)-(H4), so Proposition 4.1 applies for homogeneously distributed initial data. In what follows we will prove that a constant initial distribution is in fact the optimal distribution.
We start by showing that the functional J_T(u_0) inherits the concavity of the reaction term. In a general framework we have the following result.

Proposition 4.2. Let u be the solution of the differential equation (1.2). If the reaction term f(u) is concave, then the functional J_T(u_0) defined by (2.1) is also concave.
This concavity property in the Fisher-KPP case ensures that if u_0 is a critical point then it is a maximizer for J_T. As a straightforward consequence of Proposition 4.1, u_0(x) ≡ m/|Ω| is a global maximum for J_T. Explicitly, the solution writes

$$u(t) = \frac{\frac{m}{|\Omega|}\, e^{rt}}{1 + \frac{m}{|\Omega|}\left(e^{rt} - 1\right)},$$

and in consequence the maximum value of the functional is

$$\max_{A_m} J_T = |\Omega|\, u(T) = \frac{m\, e^{rT}}{1 + \frac{m}{|\Omega|}\left(e^{rT} - 1\right)}.$$
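A quick numerical sanity check of these closed-form expressions, reusing the helpers of the first snippet; the rate r, the mass m and the horizon T below are arbitrary illustrative choices:

```python
# Cross-check of the closed form for f(u) = r u (1 - u):
r = 1.0
fkpp = lambda u: r * u * (1.0 - u)
m, omega = 10.0, 100.0                       # mass and |Omega| for the grid above
u0 = np.full_like(x, m / omega)              # constant initial datum in A_m
T = 5.0
lhs = JT(u0, fkpp, sigma, T, dx, dt)         # value from the PDE solver
rhs = m * np.exp(r * T) / (1.0 + (m / omega) * (np.exp(r * T) - 1.0))
print(lhs, rhs)                              # should agree up to scheme error
```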

Numerical algorithm
The aim of this section is to describe an algorithm to approximate an optimal distribution when the domain Ω, the mass m and the time T are prescribed. In order to achieve this goal, the first order optimality conditions (i)-(iii) of Theorem 2.1 will be crucial. The strategy, which is basically inspired by gradient-descent optimization algorithms, will be to build a maximizing sequence u_0^1, u_0^2, u_0^3, . . . which converges to the optimal element ū_0(x).
From now on we shall make the assumption that Ω ⊂ R is an interval. Let us recall that the question we study can be seen as an optimization problem under constraints:

$$\max_{u_0 \in A_m} J_T(u_0). \tag{5.1}$$

We can then consider the associated Lagrangian problem

$$\min_{u_0} \ -J_T(u_0) + \lambda\Big(\int_\Omega u_0(x)\,dx - m\Big),$$

where λ ∈ R_+ is the Lagrange multiplier.
As already proved, J_T is differentiable, so L is also differentiable; therefore any critical point must satisfy the first order conditions (5.9)-(5.11), i.e. the candidate distribution takes the value 1 where p(0, ·) > λ and the value 0 where p(0, ·) < λ, while on the level set Ω_{p,λ} := {x ∈ Ω : p(0, x) = λ} it is obtained by solving (5.8), the threshold λ being fixed by the mass constraint. At the n-th iteration, denoting by p^n the adjoint state associated with u_0^n and by λ_n the corresponding threshold, this defines an intermediate distribution u_0^{n+1/2}, and the next iterate is taken as the convex combination

$$u_0^{n+1} = \theta_n\, u_0^{n+1/2} + (1 - \theta_n)\, u_0^n, \tag{5.16}$$

where θ_n ∈ [0, 1] is chosen so as to maximize J_T(u_0^{n+1}). This way of defining u_0^{n+1} guarantees the monotonicity of the algorithm. Although this transformation seems to violate the optimality conditions set in Theorem 2.1, once the algorithm converges the limit distribution satisfies them; this is not a straightforward fact, so we prove it as follows.

Claim: If the numerical algorithm described above converges after K iterations, i.e. u_0^{n+1} = u_0^n for every n ≥ K, (5.17)
then the limit distribution satisfies the optimality conditions of Theorem 2.1.

Proof. From the definition of u_0^{n+1} through the convex combination of u_0^n and u_0^{n+1/2}, and as a consequence of (5.17), it holds that θ_n ‖u_0^{n+1/2} − u_0^n‖ = 0 for every n ≥ K. (5.18) From this equality we deduce that for all n ≥ K one of the following two possibilities must stand: (a) u_0^{n+1/2} = u_0^n, or (b) θ_n = 0. If (a) stands, then by the definition of u_0^{n+1/2} the optimality conditions set in Theorem 2.1 necessarily hold for c = λ_n. Relatively less intuitive is the fact that the optimality conditions also hold in case (b). Indeed, for every µ ∈ [0, 1] we have

J_T(u_0^n + θ_n(u_0^{n+1/2} − u_0^n)) ≥ J_T(u_0^n + µ(u_0^{n+1/2} − u_0^n)); (5.19)

using a Taylor expansion on both sides, we get in particular for µ > θ_n that ⟨∇J_T(u_0^n), u_0^{n+1/2} − u_0^n⟩ ≤ 0. (5.20) Now we use the explicit formula for the derivative of J_T established in Lemma 3.2. (5.21) Together, (5.21) and (5.20) imply that the optimality conditions hold for c = λ_n in this case as well.

This mechanism not only improves the convergence but also makes it easy to detect: in fact, if at the n-th iteration the best θ for the convex combination is θ = 0, then the algorithm has converged, i.e. u_0^{n+1} = u_0^n, and we can stop iterating. Although the convergence of the algorithm has not been proved, the simulations show good results. In most cases convergence occurs after a few iterations, and the limit is always an element of the admissible set A_m (see Fig. 3). In a few cases the algorithm falls into a quasi-stationary state; in these cases the optimum seems to be very irregular, which might be the cause of the slow convergence. For a general picture of the algorithm see Figure 2.
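A compact sketch of the whole loop, under the same illustrative assumptions as the previous snippets (it reuses JT and gradient_JT). The bathtub routine realizes the step u_0^{n+1/2} = 1_{{p^n(0,·) > λ_n}} by bisecting on the threshold; it ignores the level set Ω_{p,λ}, on which (5.8) would have to be solved, so it is only a simplified stand-in for the authors' MATLAB routine.

```python
def bathtub(p0, mass, dx):
    """Threshold projection: bisect lam so that 1_{p0 > lam} has the given mass."""
    lo, hi = p0.min() - 1.0, p0.max() + 1.0
    for _ in range(60):
        lam = 0.5 * (lo + hi)
        if (p0 > lam).sum() * dx > mass:
            lo = lam
        else:
            hi = lam
    return (p0 > lam).astype(float), lam

def maximize_JT(u0, f, fprime, sigma, T, dx, dt, iters=30):
    """Gradient-type ascent of Section 5: replace u0 by the best convex
    combination of itself and the bathtub projection of the adjoint p(0, .);
    theta = 0 signals convergence (the stopping criterion of the Claim)."""
    u, mass = u0.copy(), u0.sum() * dx
    best = JT(u, f, sigma, T, dx, dt)
    for _ in range(iters):
        p0 = gradient_JT(u, f, fprime, sigma, T, dx, dt)
        v, lam = bathtub(p0, mass, dx)
        # crude line search over theta; the convex combination preserves mass
        cands = [(JT(th * v + (1 - th) * u, f, sigma, T, dx, dt), th)
                 for th in (1.0, 0.5, 0.25, 0.1)]
        val, theta = max(cands)
        if val <= best:                      # best theta is 0: stop iterating
            break
        u, best = theta * v + (1 - theta) * u, val
    return u, best
```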

On the issue of symmetry
Choosing a symmetrically distributed density for the initialization of the algorithm strongly induces a symmetric feature in the search space of solutions. Although this choice can be interpreted as a bias on the search space, it can actually be theoretically justified.
Without loss of generality, consider Ω = (0, a). As the solution satisfies Neumann boundary conditions, any optimal density distribution ū_0 defined over Ω is associated with a symmetric distribution ū_{0s} defined over Ω_s = (−a, a). Reciprocally, any maximizer v_0 in the class of symmetric initial data on Ω_s = (−a, a) induces a solution satisfying a Neumann boundary condition at x = 0, hence v_0 restricted to (0, a) is also a maximizer for the problem set on (0, a). Therefore there is a bijection between the maximizers on (0, a) and the maximizers in the class of symmetric functions on (−a, a).

Numerical simulations in the bistable case
For the numerical simulations we have implemented the algorithm in a MATLAB routine. At each iteration we solve the differential equations for u^n and p^n by using a forward Euler scheme in time and a finite difference approximation of the Laplace operator in space. We consider the spatial domain Ω = (−50, 50) with dx = 0.1, and a time interval t ∈ [0, T] for a given T, with dt chosen such that dt = dx²/(3σ), which respects the CFL condition and the stability condition for this scheme. The simulations show that the algorithm described above converges after a few iterations and successfully increases the value of J_T(u_0) in comparison with the trivial single-block distribution (Fig. 3a); we can also observe singularities, associated with the values verifying p(0, x) = λ; this behavior is discussed later in Section 6 (Fig. 4).

Possible generalizations
We have considered in this paper the cost function J_T(u_0) = ∫_Ω u(T, x)dx. Other costs are possible, such as, for example, I_T(u_0) = −∫_Ω (1 − u(T, x))² dx, where we put a minus sign in front of the cost so that we still want to maximize this functional. More generally, assume that we want to maximize a cost function of the form

$$\int_\Omega F\big(u(T,x)\big)\,dx,$$

where F is Lipschitz-continuous over [0, 1]. In this case, the reader can easily verify that our method is still valid, the only change being that the condition at t = T for the adjoint p becomes p(T, x) = F′(u(T, x)).
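As a concrete instance (a quick check of this rule, not spelled out in the text): for the cost I_T above one has F(u) = −(1 − u)², hence

$$p(T, x) = F'\big(u(T, x)\big) = 2\big(1 - u(T, x)\big),$$

while the backward equation satisfied by p is unchanged.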
The reader can also check that Dirichlet or Robin boundary conditions on ∂Ω could be addressed with our method. The case of unbounded domains is more tedious: if, for example, Ω = R, then a concentration-compactness argument should be used when trying to prove the existence of a maximizer ū_0 ∈ A_m. We leave such a generalization for possible future work.
Letting T → +∞

Assume that we have as much time as needed, and that we want to optimize the initial datum u_0 in order to promote invasion, that is, convergence to 1. Such a problem is not well-posed, since many initial data should give convergence to 1 at large time. Hence the set of maximizing initial data could be quite large. But still, a way of reaching it would be useful.
A natural ansatz is the limit of u_0^T as T → +∞, if it exists, where u_0^T is a maximizer of J_T. Let (u^T, p^T) be the solutions associated with u_0^T. Consider a limit, up to extraction, u_0^∞ of u_0^T as T → +∞, for the L^∞ weak-star convergence. Let u be the solution on (0, ∞) × Ω associated with u_0^∞, which is indeed the limit of u^T. Next, define p̃^T := m_T p^T, where m_T is a positive constant chosen so that ∫_Ω p̃^T(0, x)dx = 1. We know from Theorem 2.1 that there exists a constant c_T such that i) if 0 < u_0^T(x) then p̃^T(0, x) ≥ c_T, ii) if u_0^T(x) < 1 then p̃^T(0, x) ≤ c_T. Parabolic regularity yields that p̃^T converges in W^{1,2}_{q,loc}((0, ∞) × Ω) for all q ∈ (1, ∞), as T → +∞, to a solution p of the backward equation displayed below. Indeed, we know from Proposition 2.7 of [12] that such a solution is unique, up to normalization, which is given here by ∫_Ω p(0, x)dx = 1. The following partial characterization of u_0^∞ follows: i) if 0 < u_0^∞(x) then p(0, x) ≥ c, ii) if u_0^∞(x) < 1 then p(0, x) ≤ c, where c is indeed the limit of c_T.
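A plausible form of this backward equation, mirroring the reconstruction of (2.3) but posed on the whole half-line in time and without terminal condition, is

$$-\partial_t p = \sigma \Delta p + f'(u)\,p \ \text{ in } (0,\infty)\times\Omega, \qquad \frac{\partial p}{\partial \nu} = 0 \ \text{ on } (0,\infty)\times\partial\Omega, \qquad \int_\Omega p(0,x)\,dx = 1.$$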
Of course, such a partial characterization is mostly theoretical, since there is no way of constructing p numerically, except by approximating it as the limit of the functions p̃^T. Note that this adjoint function does not depend on the cost function anymore, which is satisfying: since we expect convergence to 1 at large time, the shape of the cost function should not play any role.


Figure 2. Scheme of the numerical algorithm.

Figure 3. Considering a fixed mass m = 10, this figure shows the initial data associated with the first and the last iterations of the algorithm and the corresponding solutions of equation (1.2) (a). We also show the evolution of the operator J_50(u_0^i) from the first iteration to the last one (b). Note that the limit reached after 18 iterations is an initial datum separated into two blocks, and that it shows singularities as a consequence of the definition of the initial solution within the set Ω_{p,λ_18}.

Figure 4. This figure corresponds to the 2nd and last iteration of the algorithm for a given final time T = 50 and a fixed mass L = 4. On the left-hand side we show the adjoint state p, which is the solution of equation (5.6), together with its associated value λ_2 mentioned in Theorem 2.1. Note that in this case the set Ω_{p,λ_2} is not negligible, so the associated u_0^2, shown on the right-hand side, presents singularities arising from the solution of (5.8) within this set.