• No results found

Mathematical Model

A.3.1 Ensure Feasible Routes

Routing constraints

Constraints (A.1) force all vehicles to start the route, and (A.2) force the routes to end ind.

Flow balance is ensured by (A.3), which implies that the routes are continuous between the start and end stations. Each station can only be visited once during the time horizon, with exception of the depot and end destinations which can be visited once by each vehicle.

This is handled in constraints (A.4) and (A.5), respectively. Note that symmetry breaking constraints are intentionally not included in the model, as we force each vehicle to start at unique stations, and all stations only can be visited once within the time horizon.

X

The problem is solved over a fixed time horizonT. To limit shortsightedness, each vehi-cle is allowed to initiate one visit that is scheduled to take place afterT, as illustrated in Figure A.2. This reduces the idle time that may occur at the end of a time horizon shown in Figure A.1.

Figure A.1: Idle time occurs at the end of the time horizon if no station visits are allowed after the time horizon

Figure A.2:One station visit allowed after the time horizon, preventing idle time for the vehi-cles

As discussed in the model assumptions, each station can be visited only once, while the depot can be visited once by each vehicle. The depot has a unique time of visit for each vehicle, while the other stations only have one time of visit. Due to this fact, we have de-fined the termdepot related station. A depot related station is a station that is visited right before or right after the depot, as illustrated in Figure A.3. Further, the time constraints are divided into two groups: Time constraints for depot related stations, and time constraints

for other stations.

Figure A.3: Illustration of a vehicle’s route, with depot related stations highlighted. The depot is labeled with ab

For the stationsnotrelated to depot, the arrival time of a station visitishould be before the arrival time on a later visitjif the service vehicle drives directly fromitoj, enforced by constraints (A.6). These constraints also include the parking, battery swap, bicycle handling and driving time. Constraints (A.7) ensure that the service vehicles do not arrive at the first station before the time it takes to drive there. The time of arrival should be within the time horizon for all visits except for the last, enforced by Constraints (A.8).

Constraints (A.9) make the time of a visit zero if a station is not visited.

ti+TP +TBX

For the depot related stations, constraints (A.10) set the arrival time at the depot for a specific vehicle. Note that instead of using separate variables, tBv, for setting the arrival times at the depot, we could have created a copy of the depot node for each service vehicle.

Constraints (A.11) ensure that the arrival time at the next station after the depot, is after the time of visit to the depot. Further, Constraints (A.12) ensure that the arrival at the depot

is later than the driving time from start, if the depot is a vehicle’s start station. Lastly, the visit time at the depot for a vehicle is set to zero if the vehicle does not visit the depot.

This is enforced in Constraints (A.13).

(ti+TP+TBqBiv+TC(qCCLiv +qF CLiv

+qCCUiv +qF CUiv ) +TibD tBv)xibv 0 i2S, v2V (A.10)

tBv +TP +TbjD tj xbjv0 j2S+\ {b}, v2V (A.11)

tBv Tvo(v) v2V |o(v) =b (A.12)

tBv 0

@1 X

j2S+

xbjv

1

A0 v2V (A.13)

Vehicle loading constraints

Constraints (A.14) restrict that the service vehicle cannot swap more batteries than the amount of available flat bikes after the bicycle rebalancing decisions are performed at a station. Constraints (A.15) ensure that the sum of bikes that are unloaded from a station does not exceed the number of available bicycle slots at the vehicle. Constraints (A.16) set the available battery load of the vehicles on the first station visit to the initial battery inventory of the vehicles. Constraints (A.17) and (A.18) work similarly for charged and flat bikes, respectively. The battery inventory balance for a vehicle after a depot visit is handled in Constraints (A.19) and for all other stations in Constraints (A.20). The vehicle inventory balance for bicycles are handled in Constraints (A.21) and (A.22). Lastly, Constraints (A.23) and (A.24) ensure that there are not performed any bicycle handling in the depot.

qivB lBViv i2SN, v2V (A.14)

qCCLiv +qF CLiv QCv livCV lF Viv +qCCUiv +qivF CU i2S, v2V (A.15)

lBVo(v)v=LBVv v2V (A.16)

lCVo(v)v=LCVv v2V (A.17)

lF Vo(v)v =LF Vv v2V (A.18)

xbjv(lBVjv QBv) = 0 j2S+, v2V (A.19)

xijv(lBVjv lBViv +qivB) = 0 i2S, j2S+, v2V (A.20)

xijv(lCVjv livCV +qivCCU qivCCL) = 0 i2S+\ {d}, j2S+, v2V (A.21)

xijv(lF Vjv livF V +qivF CU qivF CL) = 0 i2S+\ {d}, j2S+, v2V (A.22)

qbvCCL qbvCCU= 0 v2V (A.23)

qbvF CL qbvF CU= 0 v2V (A.24)

Station loading constraints

Constraints (A.25) handle the flat bike inventory balance per station for a visit. Constraints (A.26) handle the same for battery bikes. It should also not be possible to swap more bat-teries than there are flat bikes at a station after rebalancing actions, handled in Constraints (A.27). Constraints (A.28) and (A.29) ensure that the number of bicycles unloaded from a station does not exceed the station inventory for flat and charged bikes, respectively. The total number of loaded bikes to the station is restricted by the number of available slots, secured in (A.30). Constraints (A.31) and (A.32) ensure that all bicycle handling variables are set to zero if the station is not visited by the vehicle. Lastly, Constraints (A.33) work similarly for the number of battery swaps.

In our model,qivF CUis forced to 0 for all stations inSN, implying that one can not deliver flat bikes to non-charging stations. Further, qF CLiv andqBiv is set to 0 for all charging stations, implying that one can not remove flat bikes nor swap batteries at charging stations.

These assumptions are used to prevent unpractical solutions to the model.

liF S=LF Si +IiIF Rti i2S (A.25)

lCSi =LCSi + (IiICR IiOCR)ti+viS viC i2S (A.26)

X

In the model, only necessary variables are created. This is ensured by making the relevant sets as tight as possible. Constraints (A.34) make the decisions of which station to visit next in the route binary. Arches fromi =btoj =dare not instantiated, which make it impossible for the depot to be the last station visit. The number of batteries to change at each station visit and the quantity of bicycles to load or unload should be integer, given by constraints (A.35) to (A.39) . All variables should be non-negative, ensured in constraints (A.40) to (A.51).

qF CLiv 0, integer i2S, v2V (A.39)

ti 0 i2S+ (A.40)

tBv 0 v2V (A.41)

liCS 0 i2S (A.42)

liCS 0 i2S (A.43)

lBViv 0 i2S+, v2V (A.44)

lCViv 0 i2S+, v2V (A.45)

lF Viv 0 i2S+, v2V (A.46)

sCi 0 i2S (A.47)

sFi 0 i2S (A.48)

viS 0 i2S (A.49)

vCi 0 i2S (A.50)

di 0 i2S (A.51)

A.3.2 Objective Function

The objective function has three subobjectives: 1) Minimize total violations, 2) Minimize total deviation, and 3) Maximize reward from initiating trips that exceed the time horizon.

The latter subobjective is done to reduce idle time at the end of the time horizon.

We use the weighted sum method to treat the multi-objective optimization problem (MOP) as a single-objective optimization problem (SOP). The weights WV, WD andWR are weights for the total violations, deviations and reward, respectively. The objective function is written as follows:

minWV ·[Sum of violations] +WD·[Total deviation] WR·[Reward]

Subobjective 1: Violations

The objective of the mathematical model includes minimization of the total violations.

These violations are counted for each station from the beginning of the time horizon until the end of the time horizon,T. This captures the violations across all stations regardless of whether the station was visited and the time of the visit.

The violations occurring before a station visit at stationiis captured by vSi andviC for starvations and congestions, respectively. Recall that a station’s visit may happen within the time horizon, after the time horizon or not at all. This gives three possible situations for each station:

• Situation 1:A station does not get any visits, illustrated in Figure A.4

• Situation 2:A station is visited within the time horizon, illustrated in Figure A.5

• Situation 3:A station is visited after the time horizon, illustrated in Figure A.6 To accurately capture the violations in these three situations, additional variables are needed. First, we define four new violation variables:

viS,f Starvation from visititoT if the visit is beforeT, or total starvation ifiis not visited viC,f Congestion from visititoT if the visit is beforeT, or total congestion ifiis not visited viS,F Starvation fromTto the visit ifiis visited afterT

viC,F Congestion fromTto the visit ifiis visited afterT

Figure A.4:Situation 1: Stationiis not visited

Figure A.5:Situation 2: Stationiis visited within the time horizon

Figure A.6:Situation 3: Stationiis visited after the time horizon

The total violation is then calculated as follows:

X

i2S

viS+viC +X

i2S

⇣viS,f+vC,fi ⌘ X

i2S

⇣vS,Fi +viC,F

In the rest of this section, the constraints that capture these violation variables are intro-duced. To distinguish between the different situations, we introduce two binary variables:

These variables are determined by Constraints (A.52) - (A.55).

ti(1 i)T i2S (A.52)

ti T i i2S (A.53)

iX

v2V

xidv i2S (A.54)

i 1 if stationiis visited after the time horizonT, 0 otherwise

i 1 if stationiis visited, 0 otherwise

i= X

j2S+

X

v2V

xijv i2S (A.55)

Further, Constraints (A.56) - (A.59) ensure non-negativity, and Constraints (A.60), and (A.61) are binary constraints for the newly introduced variables.

viS,f 0 i2S (A.56)

vC,fi 0 i2S (A.57)

viS,F 0 i2S (A.58)

viC,F 0 i2S (A.59)

i2{0,1} i2S (A.60)

i2{0,1} i2S (A.61)

Situation 1:A station is not visited within the time horizon. The total violation is the sum ofvS,fi andviC,f. The station load atT issCi charged bikes andsFi flat bikes. Constraints (A.62) and (A.63) capture the violations for stations that are not visited within the time horizon for charged and flat bikes, respectively.

⇣ sCi +LCSi + (IiICR IiOCR)T+viS,f vC,fi

(1 i) = 0 i2S (A.62)

sFi +LF Si +IiIF RT (1 i) = 0 i2S (A.63) Situation 2: A station is visited within the time horizon. The total violation is the sum ofviS,vCi ,vS,fi amdviC,f. Constraints (A.64) and (A.65) capture the violations from the visit until the end of the time horizon for charged and flat bikes, respectively. Note that these constraints do only apply when i = 1and i = 0, i.e. the station is visitedbefore the time horizon.

( sCi +liCS+X Situation 3: A station is visited after the time horizon. Constraints (A.66) and (A.67) capture the violations from the time horizonT until the visit. These constraints are only realized when a station visit occurs after the time horizon, as they are multiplied by i.

⇣ liCS+sCi + (IiICR IiOCR)(ti T) +vS,Fi

i= 0 i2S (A.66)

lF Si +sFi +IiIF R(ti T) i= 0 i2S (A.67) The total starvations at a station in situation 3, is different from 0 only if the station is actually flat or empty upon visit, i.e. it does not have any battery bikes in its inventory.

Constraints (A.68) ensure this. Similarly, the total congestions is different from 0 only if the station is full upon visit, handled in Constraints (A.69).

lCSi

viS viS,F

= 0 i2S (A.68)

QSi liCS liF S (vCi viC,F) = 0 i2S (A.69) The violations captured inviS,F andvC,Fi will be subtracted from the total violations in the objective function. Hence, the model will aim to maximize these variables. Following from this maximization, we need constraints that ensure that the violationviS,F andviC,F can be greater than zero only if the station visit occurs afterT. This is done by Constraints (A.70) and (A.71).

viS,F(1 i) = 0 i2S (A.70)

vC,Fi (1 i) = 0 i2S (A.71) Lastly, we need to specify the relationship between the violations after the time horizon

and the violations before the station visit occurs. These constraints must be included to ensure that the violations after the time horizon does not take a higher value than the actual violation, ensured in Constraints (A.72) and (A.73).

vS,Fi iviS i2S (A.72)

vC,Fi iviC i2S (A.73)

Subobjective 2: Deviations

The objective function also attempts to minimize the total deviation of the system at the end of each time horizon. The deviation is defined as the absolute value between the ideal state, Oi, and current station inventory of charged bikes,sCi , both measured at T. The absolute value of the deviation is captured by the combination of Constraints (A.74) and (A.75).

di Oi sCi i2S (A.74)

di sCi Oi i2S (A.75)

The total deviation is

X

i2S

di

Subobjective 3: Reward

The effect of visiting a station, in terms of reduction in future violations, is not visible until afterthe station is visited. Similarly, deviation is measured at the time horizon, and will not be affected by later visits. Hence, a vehicle is not incentivized to initiate a station visit after the time horizon if the objective function only considers minimization of violations and deviations within the time horizon. This creates a need to introduce reward for station visits after the time horizon, in order to reduce the idle time of a vehicle at the end of the time horizon. Note that the model restricts the number of station visits after the time horizon to maximum one per vehicle.

As we do not impose a time constraint on this last visit, a challenge arises if the desired station to visit afterT is far away. This is a challenge as it may detain the service vehicle for most of the next time horizon. Consequently, we impose a reward for starting a station visit that takes place afterT, and penalize the driving time to that particular station in the

objective function.

We also want to reward moving flat bikes to charging stations. This is both to reduce the need for battery swaps in the system and to favor flat bikes over charged bikes when un-loading bikes at charging stations. In the objective function we reward the number of bikes unloaded to the charging stations with a weightWF.

There are a number of ways to introduce the reward aspect in the objective functions. In this report, we have chosen to give reward based on the number of batteries the vehicles plan to swap at the station they choose to visit. This reward is denotedriBand is weighted by WB. In addition, a penalty tFv, equal to the driving time to the chosen station, is subtracted from the reward function with weightWT. The complete reward function thus becomes:

Note that this implementation of the reward aspect does not reward depot visits. This is because we only solve the subproblem of the MDP, where the battery inventory at a vehicle atTis not examined. The effect of depot rewards should be investigated in future research.

Further, we introduce parameters, variables and constraints to calculate the correct total reward in the subproblem:

Parameters

WB Weight for battery swap reward WT Weight for penalty on driving time

WF Weight for moving flat bikes to charging stations

Variables

rBi Reward for driving to stationiafter the time horizon

tFv Driving time to the last planned station visit atTfor vehiclev

Constraints

Constraints (A.76) secure that the rewardriBat stationinever takes a higher value than the number of charged bicycles made available at the stationi. Furthermore, Constraints

(A.77) ensure that no reward is given for visits that is not a vehicle’s last station visit.

Finally, the values oftFv are captured in Constraints (A.78).

rBi X

Lastly, non-negativity, integer and binary constraints for the variables are presented in Constraints (A.79) to (A.80).

riB 0 i2S (A.79)

tFv 0 v2V (A.80)

Final expression for the objective function

The objective function is now formulated in Equation (A.81) with the expressions pre-sented in this chapter inserted.

Appendix B

UIP-Inspired Alternative Service