Monday, August 21, 2023

Codeforces Round #892, Problem C

Introduction 


In Another Permutation Problem, given a number n, the goal is to maximize an objective function over the set P(n) of all permutations of [1, n]:


max { cost(p), p in P(n) }

where


cost(p) = Sum(i * p(i), 1 <= i <= n) - Max(i * p(i), 1 <= i <= n)


Approaches


Brute force


Here are the maximum costs and associated optimal permutations for the first values of n:


n cost p
2 2 [2, 1]
3 7 [1, 3, 2]
4 17 [1, 2, 4, 3]
5 35 [1, 2, 5, 4, 3]
6 62 [1, 2, 3, 6, 5, 4]
7 100 [1, 2, 3, 4, 7, 6, 5]
8 152 [1, 2, 3, 4, 8, 7, 6, 5]
9 219 [1, 2, 3, 4, 5, 9, 8, 7, 6]
10 303 [1, 2, 3, 4, 5, 6, 10, 9, 8, 7]
11 406 [1, 2, 3, 4, 5, 6, 7, 11, 10, 9, 8]


It took 17 sec to return the solution for n=11.


Since the permutation set has size n!, enumerating all the states via backtracking does not scale as n increases.
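
For reference, here is a minimal brute-force sketch (class and method names are illustrative, not the contest submission) that enumerates all permutations recursively and keeps the best cost:

public class AnotherPermutationBruteForce {

    // cost(p) = Sum(i * p(i)) - Max(i * p(i)), with 1-based index i
    static long cost(int[] p) {
        long sum = 0;
        long max = 0;
        for (int i = 1; i <= p.length; i++) {
            long term = (long) i * p[i - 1];
            sum += term;
            max = Math.max(max, term);
        }
        return sum - max;
    }

    // fill position pos with every unused value in turn and keep the best cost
    static long search(int[] p, boolean[] used, int pos) {
        int n = p.length;
        if (pos == n) {
            return cost(p);
        }
        long best = Long.MIN_VALUE;
        for (int v = 1; v <= n; v++) {
            if (!used[v]) {
                used[v] = true;
                p[pos] = v;
                best = Math.max(best, search(p, used, pos + 1));
                used[v] = false;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int n = 8;
        // prints 152 for n = 8, matching the table above
        System.out.println(search(new int[n], new boolean[n + 1], 0));
    }
}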


But it gives an intuition about the shape of the optimal solution. From the data, the optimal permutation for n is parameterized by a second index k, 1 <= k <= n, and has the following shape:


p = [1, 2, ..., k-1, n, n-1, ..., k]


Simplify the problem


To solve the original problem, we break it down into sub-problems that are simpler to solve.

In each sub-problem, the Max value M of the elements in the Sum / Max functions is fixed:


M = Max(i * p(i), 1 <= i <= n)


We restrict the set of candidate permutations for the optimization from P(n) to P(n, M), the permutations that comply with the constraint i * p(i) <= M for all i.


max { cost(p), p in P(n, M) }


For p in P(n, M), the cost to maximize is now a linear function minus a constant:


cost(p) = Sum(i * p(i), 1 <= i <= n) - M


Greedy


The greedy approach efficiently computes the maximal solution of the sub-problem. It assigns values for indices from n down to 1. At each step i, n >= i >= 1, the rule is to assign the highest still-available value p(i) such that i * p(i) <= M.


The sub-problem complexity is O(n log(n)). We use an ordered tree as the data structure. At each iteration, we query it for the highest available value compatible with the constraint, assign the returned value to the current index of the permutation, and finally remove the allocated value from the data structure.
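
A minimal sketch of the greedy sub-problem solver under these assumptions, using a TreeSet as the ordered tree (the method name and the infeasibility convention are illustrative):

import java.util.TreeSet;

// Greedy sub-problem solver for a fixed cap M: for i = n down to 1, take the
// largest unused value v with i * v <= M, i.e. the largest available v <= M / i.
// Returns Sum(i * p(i)) - M, or Long.MIN_VALUE if no feasible assignment exists.
static long solveSubProblem(int n, long M) {
    TreeSet<Integer> available = new TreeSet<>();
    for (int v = 1; v <= n; v++) {
        available.add(v);
    }
    long sum = 0;
    for (int i = n; i >= 1; i--) {
        Integer v = available.floor((int) Math.min(n, M / i));
        if (v == null) {
            return Long.MIN_VALUE; // no remaining value satisfies i * v <= M
        }
        available.remove(v);
        sum += (long) i * v;
    }
    return sum - M;
}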


Visualization


There is no need to solve a sub-problem for every value of M in the range [n, ..., n * n]. The candidates are the distinct products i * j, 1 <= i <= n, 1 <= j <= n.


Here's the symmetric matrix of all possible product values, for n = 10:


[[ 1 2 3 4 5 6 7 8 9 10]
[ 2 4 6 8 10 12 14 16 18 20]
[ 3 6 9 12 15 18 21 24 27 30]
[ 4 8 12 16 20 24 28 32 36 40]
[ 5 10 15 20 25 30 35 40 45 50]
[ 6 12 18 24 30 36 42 48 54 60]
[ 7 14 21 28 35 42 49 56 63 70]
[ 8 16 24 32 40 48 56 64 72 80]
[ 9 18 27 36 45 54 63 72 81 90]
[ 10 20 30 40 50 60 70 80 90 100]]

The red lines constitute the wireframe of the surface. They are linear functions (straight lines) for each constant index i, 1 <= i <= n:

j: i * j

j: 1 * j
j: 2 * j
...
j: n * j


The blue lines are the "reverse" diagonals in the matrix. They are parabolic functions (curved lines), for each index i, 1 <= i <= n:


j: (i+1-j) * j


j: (2 - j) * j
j: (3 - j) * j
...
j: (11 - j) * j


The list of unique product values from the matrix is


[ 1 2 3 4 5 6 7 8 9 10 12 14 15 16 18 20 21 24
25 27 28 30 32 35 36 40 42 45 48 49 50 54 56 60 63 64
70 72 80 81 90 100]


hence enumerating all the sub-problems we need to solve.
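
Putting the pieces together, a sketch of the outer search over candidate values of M, reusing the solveSubProblem method sketched above. Note that Sum - M never overestimates the true cost of the permutation found, and for the candidate equal to the optimal permutation's own maximum the greedy value is at least the optimum, so the maximum over all candidates is exact.

import java.util.TreeSet;

static long maxCost(int n) {
    // candidate caps M are the distinct products i * j, 1 <= i <= j <= n
    TreeSet<Long> candidates = new TreeSet<>();
    for (long i = 1; i <= n; i++) {
        for (long j = i; j <= n; j++) {
            candidates.add(i * j);
        }
    }
    long best = Long.MIN_VALUE;
    for (long M : candidates) {
        best = Math.max(best, solveSubProblem(n, M));
    }
    return best;
}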


Operations Research


Finding the right permutation can be seen as an assignment problem: each index i must be assigned a value j. The assignment should maximize the cost function while respecting the constraints: if i * j > M, the associated edge is removed from the bipartite graph.



Important classes of OR problems that can model this, from most specific to most generic, are:


Linear Sum Assignment


This OR-tools guide describes how to solve an assignment problem by minimizing the sum of the weights of the edges used in the perfect matching. Two solvers are available.


Integer Programming


The assignment problem can also be formulated as a general Mixed-Integer Programming (MIP) problem. A permutation p can be represented as an n*n 0-1 matrix containing n ones, with exactly one 1 per row and per column.

An example for n=3 is

[[0 1 0]
[1 0 0]
[0 0 1]]

The permutation vector can be obtained by multiplying the permutation matrix by the index vector [1, ..., n]. For the n=3 example above, this gives p = [2, 1, 3].

We create a boolean decision variable d[i][j] for each element of the matrix. The solver will find an optimal solution for d, from which p is recovered as

p[i] = Sum(d[i][j] * j, 1 <= j <= n)

The MIP problem formulation is

Maximize

Sum(i * j * d[i][j], 1 <= i <= n, 1 <= j <= n)

With the following constraints:

For 1 <= i <= n, 1 <= j <= n,

d[i][j] in {0, 1}

For 1 <= j <= n,
Sum(d[i][j], 1 <= i <= n) = 1

For 1 <= i <= n,
Sum(d[i][j], 1 <= j <= n) = 1

For 1 <= i <= n,
Sum(i * j * d[i][j], 1 <= j <= n) <= M
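
As an illustration, here is a hedged sketch of this formulation using the OR-Tools linear solver wrapper for Java, solving the sub-problem for a fixed cap M like the greedy above (it assumes the ortools-java dependency and its native libraries are available; variable and constraint names are illustrative):

import com.google.ortools.Loader;
import com.google.ortools.linearsolver.MPConstraint;
import com.google.ortools.linearsolver.MPObjective;
import com.google.ortools.linearsolver.MPSolver;
import com.google.ortools.linearsolver.MPVariable;

static long solveSubProblemMip(int n, long M) {
    Loader.loadNativeLibraries();
    MPSolver solver = MPSolver.createSolver("SCIP"); // may return null if SCIP is unavailable

    // d[i][j] = 1 iff p(i + 1) = j + 1 (arrays are 0-based, the model is 1-based)
    MPVariable[][] d = new MPVariable[n][n];
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            d[i][j] = solver.makeBoolVar("d_" + i + "_" + j);
        }
    }

    // exactly one 1 per row and per column
    for (int i = 0; i < n; i++) {
        MPConstraint row = solver.makeConstraint(1, 1, "row_" + i);
        MPConstraint col = solver.makeConstraint(1, 1, "col_" + i);
        for (int j = 0; j < n; j++) {
            row.setCoefficient(d[i][j], 1);
            col.setCoefficient(d[j][i], 1);
        }
    }

    // for each i: Sum(i * j * d[i][j], 1 <= j <= n) <= M
    for (int i = 0; i < n; i++) {
        MPConstraint cap = solver.makeConstraint(0, M, "cap_" + i);
        for (int j = 0; j < n; j++) {
            cap.setCoefficient(d[i][j], (double) (i + 1) * (j + 1));
        }
    }

    // maximize Sum(i * j * d[i][j])
    MPObjective objective = solver.objective();
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            objective.setCoefficient(d[i][j], (double) (i + 1) * (j + 1));
        }
    }
    objective.setMaximization();

    if (solver.solve() != MPSolver.ResultStatus.OPTIMAL) {
        return Long.MIN_VALUE; // no feasible permutation for this M
    }
    return Math.round(objective.value()) - M;
}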

Monday, October 3, 2022

Codechef Starters 57, Tree and Divisors

Introduction


In the TREEDIVS problem from the Codechef Starters 57 contest, you need to output a statistic for each node in the tree.

Each node v is associated with an integer value A_v.


The statistic to compute for a node is the number of divisors of the product of the values of all nodes within its subtree.

For v a node in the tree,


Statistics(v) = DivisorCount(TreeProduct(v))


TreeProduct(v) = Product_{u in Tree(v)} A_u


Counting the divisors of a number requires its prime factorization: if x = p1^e1 * ... * pk^ek, then DivisorCount(x) = (e1 + 1) * ... * (ek + 1).



While my logic was correct, my submission got Time Limit Exceeded on a few test cases.





In this article, we will go over multiple optimization strategies to bring the runtime within the time limit.


Refer to the successful submission from another community member, from which I got the hints for the optimizations. See also the Codechef editorial and the USACO article on Small-to-Large merging.


The performance of 4 different versions run against 2 different types of inputs is:



Runtime got slightly worse on random input. On slim input, it went down drastically.


Setup


Input format


The input consists of

  1. N, the tree size
  2. A, the list of node values
  3. (u_i, v_i), the list of edges used to build the adjacency list


3
4
100 101 102 103
1 2
1 3
1 4
4
2 2 2 2
1 2
2 3
3 4
5
43 525 524 12 289
1 2
1 3
3 4
4 5


Expected output


For each test case, we output the list of node statistics. For instance, in the first test case the subtree product at the root is 100 * 101 * 102 * 103 = 2^3 * 3 * 5^2 * 17 * 101 * 103, which has 4 * 2 * 3 * 2 * 2 * 2 = 192 divisors:


192 2 8 2 
5 4 3 2
1080 12 60 18 3


Input Parsing


package codechef.starters57.TREEDIVS;

import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintWriter;

public class TreeAndDivisorsMain {
public static void main(String[] args) throws IOException {
TreeAndDivisorsFactory factory = TreeAndDivisors3::new;

//InputStream inputStream = System.in;
InputStream inputStream = new FileInputStream("TREEDIVS");
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
PrintWriter writer = new PrintWriter(new BufferedOutputStream(System.out));

String[] tokens;

tokens = bufferedReader.readLine().split(" ");
int T = Integer.parseInt(tokens[0]);
while (T > 0) {
tokens = bufferedReader.readLine().split(" ");
int N = Integer.parseInt(tokens[0]);
tokens = bufferedReader.readLine().split(" ");

int[] A = new int[N];
for (int i = 0; i < N; i++) {
A[i] = Integer.parseInt(tokens[i]);
}

TreeAndDivisors treeAndDivisors = factory.createTreeAndDivisors(N, A);
for (int i = 0; i < N-1; i++) {
tokens = bufferedReader.readLine().split(" ");
int u = Integer.parseInt(tokens[0]);
int v = Integer.parseInt(tokens[1]);
treeAndDivisors.addEdge(u-1, v-1);
}

int[] divisorCount = treeAndDivisors.divisors();
String divisorLine = AbstractTreeAndDivisors.listToString(divisorCount);
writer.println(divisorLine);

T--;
}

writer.close();
inputStream.close();
}
}


Load Test


We define an interface for the solution contract:


public interface TreeAndDivisors {
void addEdge(int u, int v);
int[] divisors();
}


We define a factory to decouple running a solution on input data from the implementation itself.


public interface TreeAndDivisorsFactory {
TreeAndDivisors createTreeAndDivisors(int n, int[] A);
}


We define a tree generator to compare the effect of the tree structure shape on runtime:


public interface TreeGenerator {
void generate(List<List<Integer>> adjacency);
}


Unit Test


Here's a JUnit 5 parameterized unit test:


@ParameterizedTest
@MethodSource
public void testSolution(int[] A, int[][] edges, int[] expectedDivisors) {
TreeAndDivisorsFactory factory = TreeAndDivisors3::new;
TreeAndDivisors treeAndDivisors = factory.createTreeAndDivisors(A.length, A);
Arrays.stream(edges)
.forEach(edge -> treeAndDivisors.addEdge(edge[0], edge[1]));
Assertions.assertArrayEquals(expectedDivisors, treeAndDivisors.divisors());
}

static Object[][] testSolution() {
return new Object[][] {
new Object[] {
new int[] { 100, 101, 102, 103 },
new int[][] { { 0, 1 }, { 0, 2 }, { 0, 3 } },
new int[] { 192, 2, 8, 2 } }
};
}


Input Generation


We define 3 input generators, for a random distribution and 2 extreme cases.

1. With setRandomAdjacency, the node distribution is random. For each node i, 0 < i < N, its parent is selected at random in [0, ..., i-1].


2. With setSlimAdjacency, the tree degenerates in a list.




3. With setThickAdjacency, a single parent contains all the nodes as direct children.




static void setThickAdjacency(List<List<Integer>> adjacency) {
    for (int i = 1; i < adjacency.size(); i++) {
        adjacency.get(0).add(i);
    }
}

static void setSlimAdjacency(List<List<Integer>> adjacency) {
    for (int i = 1; i < adjacency.size(); i++) {
        adjacency.get(i - 1).add(i);
    }
}

static void setRandomAdjacency(List<List<Integer>> adjacency) {
    for (int i = 1; i < adjacency.size(); i++) {
        int parent = random.nextInt(i);
        adjacency.get(parent).add(i);
    }
}



Analysis


Recursion enables computing a parent's value from its children's values, so a Depth First Search traversal from the root can compute all node statistics, finalizing each node once all of its children have been processed. The traversal runs iteratively with an explicit stack, instead of recursively through the call stack, to avoid a potential StackOverflowError on deep trees.


Like in this other problem, the DFS traversal is abstracted away to focus only on the custom logic. This construct separates the traversal logic from the update logic, which lives in updateNode.

private void dfs() {
    Traversal traversal = new AdjacencyListDFSTraversal(adjacency);
    traversal.traverse(this::updateNode);
}

protected abstract void updateNode(int current, int parent, List<Integer> children);
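
The AdjacencyListDFSTraversal implementation is not shown here; a minimal sketch of such an iterative traversal could look like the following, assuming the tree is rooted at node 0 and that the visitor must be called for a node only after all of its children (the Visitor interface is a hypothetical stand-in for the actual callback type):

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// hypothetical callback type matching the updateNode signature
interface Visitor {
    void visit(int current, int parent, List<Integer> children);
}

// Iterative DFS from the root (node 0): record a preorder with an explicit stack,
// then replay it in reverse so that every node is visited after all of its children.
static void traverse(List<List<Integer>> adjacency, Visitor visitor) {
    int n = adjacency.size();
    int[] parent = new int[n];
    parent[0] = -1;
    boolean[] seen = new boolean[n];
    seen[0] = true;
    List<Integer> preorder = new ArrayList<>();
    Deque<Integer> stack = new ArrayDeque<>();
    stack.push(0);
    while (!stack.isEmpty()) {
        int node = stack.pop();
        preorder.add(node);
        for (int child : adjacency.get(node)) {
            if (!seen[child]) {
                seen[child] = true;
                parent[child] = node;
                stack.push(child);
            }
        }
    }
    for (int i = preorder.size() - 1; i >= 0; i--) {
        int node = preorder.get(i);
        visitor.visit(node, parent[node], adjacency.get(node));
    }
}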

Data structure: Hash Map


When computing the product value of the current node, we need to keep track of the prime factorization of the product value of each child node.

Each prime factorization is stored as a Map<Integer, Integer> where

  • keys are prime numbers
  • values are the associated exponents
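
For reference, minimal sketches of the primeFactors, mergeExponents and divisorCount helpers used below (hypothetical bodies; the actual submission may differ, e.g. by precomputing smallest prime factors, and any modular arithmetic the full problem may require is ignored here):

import java.util.HashMap;
import java.util.Map;

// trial-division factorization of a single value into {prime -> exponent}
static Map<Integer, Integer> primeFactors(int x) {
    Map<Integer, Integer> exponents = new HashMap<>();
    for (int p = 2; (long) p * p <= x; p++) {
        while (x % p == 0) {
            exponents.merge(p, 1, Integer::sum);
            x /= p;
        }
    }
    if (x > 1) {
        exponents.merge(x, 1, Integer::sum);
    }
    return exponents;
}

// add the child's exponents into the parent's map (i.e. multiply the two products)
static void mergeExponents(Map<Integer, Integer> parent, Map<Integer, Integer> child) {
    child.forEach((prime, exponent) -> parent.merge(prime, exponent, Integer::sum));
}

// the number of divisors of p1^e1 * ... * pk^ek is (e1 + 1) * ... * (ek + 1)
static int divisorCount(Map<Integer, Integer> exponents) {
    int count = 1;
    for (int exponent : exponents.values()) {
        count *= exponent + 1;
    }
    return count;
}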


Version 1

Computing the parent hash map requires 1 merge operation per child.
@Override
protected void updateNode(int current, int parent, List<Integer> children) {
    // start from the factorization of the node's own value
    Map<Integer, Integer> parentExponents = primeFactors(A[current]);

    // merge the factorization of each child's subtree product
    children.stream()
            .filter(child -> child != parent)
            .map(child -> primeExponents[child])
            .forEach(childExponents -> mergeExponents(parentExponents, childExponents));

    primeExponents[current] = parentExponents;
    divisorCount[current] = divisorCount(parentExponents);
}

In the slim case where the tree is a long list,

  • node n at the bottom requires 0 merges
  • node n-1 requires 1 merge
  • ...
  • node 1 at the top requires n-1 merges

This results in O(N^2) total merges.

However, we observe an exponential-looking trend in the runtime as N grows:



Then we start getting memory errors as we increase N. The error is thrown while growing the hash table in HashMap.resize:


Caused by: java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.HashMap.resize(HashMap.java:702)
at java.base/java.util.HashMap.merge(HashMap.java:1363)
at codechef.starters57.TREEDIVS.AbstractTreeAndDivisors.mergeExponents(AbstractTreeAndDivisors.java:85)
at codechef.starters57.TREEDIVS.TreeAndDivisors1.lambda$updateNode$2(TreeAndDivisors1.java:19)
at codechef.starters57.TREEDIVS.TreeAndDivisors1$$Lambda$472/0x0000000800d54248.accept(Unknown Source)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
at codechef.starters57.TREEDIVS.TreeAndDivisors1.updateNode(TreeAndDivisors1.java:19)
at codechef.starters57.TREEDIVS.AbstractTreeAndDivisors$$Lambda$466/0x0000000800d4efa0.visit(Unknown Source)
at graph.tree.traversal.AbstractDFSTraversal.traverse(AbstractDFSTraversal.java:63)
at graph.tree.traversal.AbstractDFSTraversal.traverse(AbstractDFSTraversal.java:36)
at codechef.starters57.TREEDIVS.AbstractTreeAndDivisors.dfs(AbstractTreeAndDivisors.java:78)
at codechef.starters57.TREEDIVS.AbstractTreeAndDivisors.divisors(AbstractTreeAndDivisors.java:67)
at codechef.starters57.TREEDIVS.TreeAndDivisorsTest.runInput(TreeAndDivisorsTest.java:89)
at codechef.starters57.TREEDIVS.TreeAndDivisorsTest.runBatch(TreeAndDivisorsTest.java:68)
at codechef.starters57.TREEDIVS.TreeAndDivisorsTest.run(TreeAndDivisorsTest.java:24)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)


The original approach had a memory leak: it kept every intermediate hash map alive, while only the current node's and its children's maps were required.

To reduce the memory footprint, we null out the children's Map references once we no longer need them. This marks the objects as candidates for Garbage Collection.


@Override
protected void updateNode(int current, int parent, List<Integer> children) {
    Map<Integer, Integer> parentExponents = primeFactors(A[current]);

    children.stream()
            .filter(child -> child != parent)
            .forEach(child -> {
                mergeExponents(parentExponents, primeExponents[child]);
                // drop the reference so the child's map becomes eligible for GC
                primeExponents[child] = null;
            });

    primeExponents[current] = parentExponents;
    divisorCount[current] = divisorCount(parentExponents);
}


We can now see the O(N^2) trend:



Versions 2 & 3


We can skip a large number of merge operations by reusing one child's map instead of merging it. Selecting the child whose map has the most entries saves the most work: we merge everything else into that map and assign the same object to the current node.

@Override
protected void updateNode(int current, int parent, List<Integer> children) {
    Map<Integer, Integer> valueExponents = primeFactors(A[current]);

    // pick the child whose exponent map has the most entries
    Optional<Integer> maxOptional = children.stream()
            .filter(child -> child != parent)
            .max(Comparator.comparing(child -> primeExponents[child].size()));

    Map<Integer, Integer> parentExponents;

    if (maxOptional.isPresent()) {
        int maxChild = maxOptional.get();

        // reuse the largest child's map and merge everything else into it
        parentExponents = primeExponents[maxChild];
        mergeExponents(parentExponents, valueExponents);
        children.stream()
                .filter(child -> !(child == parent || child == maxChild))
                .map(child -> primeExponents[child])
                .forEach(childExponents -> mergeExponents(parentExponents, childExponents));
    } else {
        // leaf node: only its own value contributes
        parentExponents = valueExponents;
    }

    primeExponents[current] = parentExponents;
    divisorCount[current] = divisorCount(parentExponents);
}



Version 4


Computing the result requires 2 operations

  1. merge prime factor exponents
  2. compute divisor count

Instead of doing them separately, combining them into a single pass reduces the number of iterations through the hash map keys by roughly 50%.

We switch from mergeExponents to mergeMultiplyExponents to perform both actions at once.


@Override
protected void updateNode(int current, int parent, List<Integer> children) {
    Map<Integer, Integer> valueExponents = primeFactors(A[current]);

    Optional<Integer> maxOptional = children.stream()
            .filter(child -> child != parent)
            .max(Comparator.comparing(child -> primeExponents[child].size()));

    Map<Integer, Integer> parentExponents;
    int dc;

    if (maxOptional.isPresent()) {
        int maxChild = maxOptional.get();

        // reuse the largest child's map and its already-computed divisor count
        parentExponents = primeExponents[maxChild];
        dc = divisorCount[maxChild];

        dc = mergeMultiplyExponents(dc, parentExponents, valueExponents);

        for (int child : children) {
            if (child == parent || child == maxChild) {
                continue;
            }
            dc = mergeMultiplyExponents(dc, parentExponents, primeExponents[child]);
        }
    } else {
        parentExponents = valueExponents;
        dc = divisorCount(valueExponents);
    }

    primeExponents[current] = parentExponents;
    divisorCount[current] = dc;
}
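
A sketch of what mergeMultiplyExponents might look like: it folds the source exponents into the target map while incrementally maintaining the running divisor count, so no separate full scan of the map is needed (again, any modulus required by the full problem is ignored here, and the actual helper in the submission may differ):

import java.util.Map;

// Merge 'source' exponents into 'target' while maintaining the running divisor count dc:
// raising p^old to p^(old + extra) multiplies the count by (old + extra + 1) / (old + 1),
// and (old + 1) always divides dc exactly.
static int mergeMultiplyExponents(int dc, Map<Integer, Integer> target, Map<Integer, Integer> source) {
    for (Map.Entry<Integer, Integer> entry : source.entrySet()) {
        int prime = entry.getKey();
        int extra = entry.getValue();
        int old = target.getOrDefault(prime, 0);
        dc = dc / (old + 1) * (old + extra + 1);
        target.put(prime, old + extra);
    }
    return dc;
}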


Gradle Report


See this commit for the full application code.

We compare multiple versions

  1. Original
  2. Null out children maps
  3. Reuse max child Map
  4. 50% less Map scans

against 2 inputs

  • random
  • slim

For slim input with N=4000 nodes, the runtimes were

  1. [27.915s] Original version
  2. [17.442s] Null out children maps
  3. [4.536s] Reuse max child Map
  4. [0.312s] Reuse Map scan for both merging and divisor count computation

For random input with N=30000 nodes, the runtimes were

  1. [3.562s] Original version
  2. [3.324s] Null out children maps
  3. [3.368s] Reuse max child Map
  4. [3.881s] Reuse Map scan for both merging and divisor count computation





Conclusion


As we merge hash tables bottom-up towards the root, the hash tables grow significantly. Iterating through their entries is the bottleneck of the execution. We should optimize towards reducing full hash table scans as much as possible.

Wednesday, September 7, 2022

Meta Hacker Cup 2022, Qualification Round, Problem D, Second Flight

Introduction


In Problem D, we need to return the maximum number of passengers who can travel from airport X to airport Y, using either

  1. a direct flight, available in the morning & the evening
  2. a 2-leg flight, going through 1 intermediate airport

The graph contains N nodes and M edges. We need to answer Q queries overall.


Interface


The end goal is to implement the following interface within the problem constraints:


      interface SecondFlightService {
      void addFlight(int x, int y, int capacity);
      void setConnectionCapacity();
      long passengers(int x, int y);
      }

  • addFlight is called for each edge to add it
  • setConnectionCapacity is called once after all edges have been added
  • passengers is called to process each query

Input Parsing


      package hackercup2022;

      import java.io.BufferedOutputStream;
      import java.io.BufferedReader;
      import java.io.FileInputStream;
      import java.io.FileOutputStream;
      import java.io.InputStream;
      import java.io.InputStreamReader;
      import java.io.OutputStream;
      import java.io.PrintWriter;
      import java.util.ArrayList;
      import java.util.List;

      public class SecondFlight implements SecondFlightService {

      public static void main(String[] args) throws Exception {
      //InputStream inputStream = System.in;
      InputStream inputStream = new FileInputStream("second_flight_sample_input.txt");
      BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream));

      //OutputStream outputStream = System.out;
      OutputStream outputStream = new FileOutputStream("second_flight_sample_output.txt");
      PrintWriter writer = new PrintWriter(new BufferedOutputStream(outputStream));

      String[] tokens;

      tokens = bufferedReader.readLine().split(" ");
      int T = Integer.parseInt(tokens[0]);
      for (int t = 1; t <= T; t++) {
      tokens = bufferedReader.readLine().split(" ");
      int N = Integer.parseInt(tokens[0]);
      int M = Integer.parseInt(tokens[1]);
      int Q = Integer.parseInt(tokens[2]);
      int[] A = new int[M];
      int[] B = new int[M];
      int[] C = new int[M];
      SecondFlightService secondFlight = new SecondFlight(N);
      for (int i = 0; i < M; i++) {
      tokens = bufferedReader.readLine().split(" ");
      A[i] = Integer.parseInt(tokens[0]);
      B[i] = Integer.parseInt(tokens[1]);
      C[i] = Integer.parseInt(tokens[2]);
      secondFlight.addFlight(A[i], B[i], C[i]);
      }

      secondFlight.setConnectionCapacity();

      writer.print(String.format("Case #%d:", t));
      for (int i = 0; i < Q; i++) {
      tokens = bufferedReader.readLine().split(" ");
      int x = Integer.parseInt(tokens[0]);
      int y = Integer.parseInt(tokens[1]);
      writer.print(" " + secondFlight.passengers(x, y));
      }
      writer.println();
      }

      writer.close();
      outputStream.close();
      inputStream.close();
      }
      }


Solution with adjacency matrix


We iterate over the list of nodes. We maintain two 2D arrays, capacity and connectionCapacity, holding the counts for direct and 2-leg flights respectively.


      public class SecondFlight {


      /**
      * number of airports
      */
      int N;

      public SecondFlight(int N) {
      this.N = N;
      capacity = new int[N+1][N+1];
      connectionCapacity = new int[N+1][N+1];
      }

      int[][] capacity;
      int[][] connectionCapacity;
      public void addFlight(int a, int b, int c) {
      capacity[a][b] = c;
      capacity[b][a] = c;
      }

      public void setConnectionCapacity() {
      for (int x = 1; x <= N; x++) {
      for (int y = 1; y <= N; y++) {
      if (capacity[x][y] == 0) {
      continue;
      }
      for (int z = 1; z <= N; z++) {
      if (capacity[y][z] == 0 || x == z) {
      continue;
      }

      connectionCapacity[x][z] += Math.min(capacity[x][y], capacity[y][z]);
      // the symmetric count for (z, x) is accumulated when the outer loop variable x takes the value z
      }
      }
      }
      }

      public int passengers(int x, int y) {
      return 2 * capacity[x][y] + connectionCapacity[x][y];
      }
      }


This approach has O(N^3) runtime complexity and O(N^2) space complexity.

We cache all possible values. Each 2-leg flight [x, y, z] contributes to the count of the associated (x, z) pair.

The matrix is symmetric since the graph is undirected: a flight going in either direction, [x, y] or [y, x], has the same capacity.


Caching all values


We now switch from an adjacency matrix to an adjacency list.

We use directed edges to represent the graph.


      static class Edge {
      int a, b, capacity;
      Edge(int a, int b, int capacity) {
      this.a = a;
      this.b = b;
      this.capacity = capacity;
      }
      }


We iterate over the list of edges (List<Edge>) instead of the nodes.

We look up the airports connected to a given airport in the List<List<Edge>> adjacency list.


      public class SecondFlight {


      /**
      * number of airports
      */
      int N;

      int[][] capacity;


      List<Edge> edges = new ArrayList<>();
      List<List<Edge>> adjacency = new ArrayList<>();

      public SecondFlight(int N) {
      this.N = N;
      capacity = new int[N+1][N+1];

      for (int i = 0; i <= N; i++) {
      adjacency.add(new ArrayList<>());
      }
      }

      public void addFlight(int a, int b, int c) {
      addEdge(new Edge(a, b, c));
      addEdge(new Edge(b, a, c));
      }

      private void addEdge(Edge edge) {
      edges.add(edge);
      adjacency.get(edge.a).add(edge);
      capacity[edge.a][edge.b] = 2*edge.capacity;
      }


      public void setConnectionCapacity() {
      edges.forEach(this::addSecondLeg);
      }

      private void addSecondLeg(Edge leg1) {
      for (Edge leg2 : adjacency.get(leg1.b)) {
      if (leg2.b != leg1.a) {
      capacity[leg1.a][leg2.b] += Math.min(leg1.capacity, leg2.capacity);
      }
      }
      }

      public int passengers(int x, int y) {
      return capacity[x][y];
      }
      }


What's the new runtime complexity?

For each edge leg1, we iterate over all the neighbors of the destination airport of leg1. If the second leg does not bring us back to the start, we update the count associated with the 2-leg flight [leg1.a, leg1.b, leg2.b].

Each edge incoming to a node v can be used as leg1, and each edge outgoing from v can be used as leg2, as long as it's not the reverse of leg1.

Since the graph is undirected, we have

deg_in(v) = deg_out(v) = deg(v)

so the overall work is

Sum(deg(v)^2, v) <= (Sum(deg(v), v))^2 = (2M)^2

To summarize,

  • runtime complexity is O(M^2)
  • space complexity is O(N^2)

which won't work within the problem constraints

1 <= M, N <= 200 000


Computing the query


Computing each query on demand, instead of precomputing all of them, does not require the 2D array:


      public class SecondFlight implements SecondFlightService {


      /**
      * number of airports
      */
      int N;

      static class Edge {
      int a, b, capacity;
      Edge(int a, int b, int capacity) {
      this.a = a;
      this.b = b;
      this.capacity = capacity;
      }

      @Override
      public String toString() {
      return String.format("(%d->%d) %d", a, b, capacity);
      }
      }
      List<Map<Integer, Edge>> adjacency = new ArrayList<>();

      public SecondFlight(int N) {
      this.N = N;
      for (int i = 0; i <= N; i++) {
      adjacency.add(new HashMap<>());
      }
      }

      public void addFlight(int a, int b, int c) {
      addEdge(new Edge(a, b, c));
      addEdge(new Edge(b, a, c));
      }

      private void addEdge(Edge edge) {
      adjacency.get(edge.a).put(edge.b, edge);
      }

      public void setConnectionCapacity() {
      }

      public long passengers(int x, int y) {
      long p = 0;
      for (Edge leg1: adjacency.get(x).values()) {
      if (leg1.b == y) {
      p += 2 * leg1.capacity;
      continue;
      }

      Edge leg2 = adjacency.get(leg1.b).get(y);
      if (leg2 == null) {
      continue;
      }

      p += Math.min(leg1.capacity, leg2.capacity);
      }
      return p;
      }
      }


We switched from List<Edge> to Map<Integer, Edge> in the adjacency list, so the second leg can be looked up in O(1).

The complexity of computing one result is O(deg(x)), which is O(M) in the worst case.


Optimizations


The optimizations outlined in the editorial are:


Caching query results


We cache the results in a 2D map, Map<Integer, Map<Integer, Long>>.

Selecting the node with the smaller degree


We compare adjacency list sizes to start the query from the node with fewer neighbors.

Solution


The new version of the query logic is:

      Map<Integer, Map<Integer, Long>> results = new HashMap<>();

      public long passengers(int x, int y) {
      if (adjacency.get(x).size() > adjacency.get(y).size()) {
      return passengers(y, x);
      }
      // assert x has fewer neighbors than y

      return results.computeIfAbsent(x, x2 -> new HashMap<>())
      .computeIfAbsent(y, y2 -> computePassengers(x, y));
      }

      private long computePassengers(int x, int y) {
      long p = 0;
      for (Edge leg1: adjacency.get(x).values()) {
      if (leg1.b == y) {
      p += 2 * leg1.capacity;
      continue;
      }

      Edge leg2 = adjacency.get(leg1.b).get(y);
      if (leg2 == null) {
      continue;
      }

      p += Math.min(leg1.capacity, leg2.capacity);
      }
      return p;
      }

Conclusion


Simple approaches did not fit the problem constraints.

We first looked at offline computation. To answer each query in O(1), caching all values required O(N^2) space; computing them took O(M^2) time. Both are impractical.

For online computation, a single query initially ran in O(M), using a map-based adjacency list as the optimized data structure.

It turned out that simple further optimizations, caching previous query results and starting from the node with the smaller degree, did the trick, reducing the overall runtime from O(Q * M) down to O(sqrt(Q) * M).