Wednesday, September 7, 2022

Meta Hacker Cup 2022, Qualification Round, Problem D, Second Flight

Introduction


In Problem D, we need to return the max number of passengers who can travel from airport X to airport Y, using either

  1. a direct flight, available in the morning & the evening
  2. a 2-leg flight, going through 1 intermediate airport

Graph contains N nodes and M edges. We need to run Q queries overall.


Interface


The end-goal is to implement following interface within the problem constraints,


interface SecondFlightService {
void addFlight(int x, int y, int capacity);
void setConnectionCapacity();
long passengers(int x, int y);
}

  • addFlight is called for each edge to add it
  • setConnectionCapacity is called once after all edges were added
  • passengers is called to process the query

Input Parsing


package hackercup2022;

import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;

public class SecondFlight implements SecondFlightService {

public static void main(String[] args) throws Exception {
//InputStream inputStream = System.in;
InputStream inputStream = new FileInputStream("second_flight_sample_input.txt");
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream));

//OutputStream outputStream = System.out;
OutputStream outputStream = new FileOutputStream("second_flight_sample_output.txt");
PrintWriter writer = new PrintWriter(new BufferedOutputStream(outputStream));

String[] tokens;

tokens = bufferedReader.readLine().split(" ");
int T = Integer.parseInt(tokens[0]);
for (int t = 1; t <= T; t++) {
tokens = bufferedReader.readLine().split(" ");
int N = Integer.parseInt(tokens[0]);
int M = Integer.parseInt(tokens[1]);
int Q = Integer.parseInt(tokens[2]);
int[] A = new int[M];
int[] B = new int[M];
int[] C = new int[M];
SecondFlightService secondFlight = new SecondFlight(N);
for (int i = 0; i < M; i++) {
tokens = bufferedReader.readLine().split(" ");
A[i] = Integer.parseInt(tokens[0]);
B[i] = Integer.parseInt(tokens[1]);
C[i] = Integer.parseInt(tokens[2]);
secondFlight.addFlight(A[i], B[i], C[i]);
}

secondFlight.setConnectionCapacity();

writer.print(String.format("Case #%d:", t));
for (int i = 0; i < Q; i++) {
tokens = bufferedReader.readLine().split(" ");
int x = Integer.parseInt(tokens[0]);
int y = Integer.parseInt(tokens[1]);
writer.print(" " + secondFlight.passengers(x, y));
}
writer.println();
}

writer.close();
outputStream.close();
inputStream.close();
}
}


Solution with adjacency matrix


We iterate over the list of nodes. We maintain 2 2D arrays, capacity vs connectionCapacity, to maintain the counts for direct vs indirect flights.


public class SecondFlight {


/**
* number of airports
*/
int N;

public SecondFlight(int N) {
this.N = N;
capacity = new int[N+1][N+1];
connectionCapacity = new int[N+1][N+1];
}

int[][] capacity;
int[][] connectionCapacity;
public void addFlight(int a, int b, int c) {
capacity[a][b] = c;
capacity[b][a] = c;
}

public void setConnectionCapacity() {
for (int x = 1; x <= N; x++) {
for (int y = 1; y <= N; y++) {
if (capacity[x][y] == 0) {
continue;
}
for (int z = 1; z <= N; z++) {
if (capacity[y][z] == 0 || x == z) {
continue;
}

connectionCapacity[x][z] += Math.min(capacity[x][y], capacity[y][z]);
connectionCapacity[z][x] = connectionCapacity[z][x];
}
}
}
}

public int passengers(int x, int y) {
return 2 * capacity[x][y] + connectionCapacity[x][y];
}
}


This approach runs in O(N^3) runtime complexity, O(N^2) space complexity.

We cache all possible values. Each 2-leg flight [x, y, z] contribute to the count for associated (x, z) pair.

The matrix is symmetrical since we use an undirected graph. A flight going in either direction [x,y] or [y,x] has the same capacity.


Caching all values


We now switch from adjacency matrix to adjacency list.

We use directed edges to represent the graph.


static class Edge {
int a, b, capacity;
Edge(int a, int b, int capacity) {
this.a = a;
this.b = b;
this.capacity = capacity;
}
}


We iterate over List<Edge> the list of edges, instead of nodes.

We lookup the list of the airports connected to any airport in List<List<Edge> adjacency list.


public class SecondFlight {


/**
* number of airports
*/
int N;

int[][] capacity;


List<Edge> edges = new ArrayList<>();
List<List<Edge>> adjacency = new ArrayList<>();

public SecondFlight(int N) {
this.N = N;
capacity = new int[N+1][N+1];

for (int i = 0; i <= N; i++) {
adjacency.add(new ArrayList<>());
}
}

public void addFlight(int a, int b, int c) {
addEdge(new Edge(a, b, c));
addEdge(new Edge(b, a, c));
}

private void addEdge(Edge edge) {
edges.add(edge);
adjacency.get(edge.a).add(edge);
capacity[edge.a][edge.b] = 2*edge.capacity;
}


public void setConnectionCapacity() {
edges.forEach(this::addSecondLeg);
}

private void addSecondLeg(Edge leg1) {
for (Edge leg2 : adjacency.get(leg1.b)) {
if (leg2.b != leg1.a) {
capacity[leg1.a][leg2.b] += Math.min(leg1.capacity, leg2.capacity);
}
}
}

public int passengers(int x, int y) {
return capacity[x][y];
}
}


What's the new runtime complexity?

For each edge leg1, we iterate over all the neighbors of the destination airport of the flight represented by leg1. If we don't end up with a round trip, we update the count associated to the considered 2-leg flight, [leg1.a, leg1.b, leg2.b]

Each edge incoming to a node v can be used as leg1, each edge outgoing from v can be used as leg2, as long as it's not the reverse of leg1.

Since this is undirected graph, we have

deg_in(v) = deg_out(v) = deg(v)

so the overall complexity is

Sum(deg(v)^2) <= (Sum(deg(v))^2 = M^2

To summarize,

  • runtime complexity is O(M^2)
  • space complexity is O(N^2)

which won't work with problem constraint

1 <= M,N <= 200 000


Computing the query


Computing the query instead of precomputing it does not require the 2-D array:


public class SecondFlight implements SecondFlightService {


/**
* number of airports
*/
int N;

static class Edge {
int a, b, capacity;
Edge(int a, int b, int capacity) {
this.a = a;
this.b = b;
this.capacity = capacity;
}

@Override
public String toString() {
return String.format("(%d->%d) %d", a, b, capacity);
}
}
List<Map<Integer, Edge>> adjacency = new ArrayList<>();

public SecondFlight(int N) {
this.N = N;
for (int i = 0; i <= N; i++) {
adjacency.add(new HashMap<>());
}
}

public void addFlight(int a, int b, int c) {
addEdge(new Edge(a, b, c));
addEdge(new Edge(b, a, c));
}

private void addEdge(Edge edge) {
adjacency.get(edge.a).put(edge.b, edge);
}

public void setConnectionCapacity() {
}

public long passengers(int x, int y) {
long p = 0;
for (Edge leg1: adjacency.get(x).values()) {
if (leg1.b == y) {
p += 2 * leg1.capacity;
continue;
}

Edge leg2 = adjacency.get(leg1.b).get(y);
if (leg2 == null) {
continue;
}

p += Math.min(leg1.capacity, leg2.capacity);
}
return p;
}
}


We switched from List<Edge> to  Map<Integer, Edge> in adjacency list.

The complexity of calculating the result is O(deg(x)) which is O(M).


Optimizations


Optimizations outlined in the editorial are


Caching query results


We cache the results in a 2D map, Map<Integer, Map<Integer, Long>>.

Selecting the node with smaller degree


We check adjacency list size to pick start node with fewer neighbors.

Solution


The new version for the query logics is

Map<Integer, Map<Integer, Long>> results = new HashMap<>();

public long passengers(int x, int y) {
if (adjacency.get(x).size() > adjacency.get(y).size()) {
return passengers(y, x);
}
// assert x has fewer neighbors than y

return results.computeIfAbsent(x, x2 -> new HashMap<>())
.computeIfAbsent(y, y2 -> computePassengers(x, y));
}

private long computePassengers(int x, int y) {
long p = 0;
for (Edge leg1: adjacency.get(x).values()) {
if (leg1.b == y) {
p += 2 * leg1.capacity;
continue;
}

Edge leg2 = adjacency.get(leg1.b).get(y);
if (leg2 == null) {
continue;
}

p += Math.min(leg1.capacity, leg2.capacity);
}
return p;
}

Conclusion


Simple approaches did not fit problem constraints.

We looked at doing offline computation. To get query in O(1), caching all values required O(N^2) space complexity. Calculating them was O(M^2) runtime complexity. Both size are impractical.

For online computation, 1 query's runtime was at first O(M) with optimized data structure using map in adjacency list.

It turned out that simple further optimizations such as caching previous queries and picking node with smaller degree did the trick to reduce overall runtime from O(Q*M) down to O(Q^(1/2)*M).