A Machine Learning-Based Approximation of Strong Branching

程序猿声 2021-08-09

These are reading notes on the paper and reflect my own understanding; corrections are welcome and much appreciated. The paper falls under the category of machine learning alongside optimization algorithms.

01 Introduction

Most current mixed-integer programming (MIP) solvers are based on the branch-and-bound (B&B) algorithm. In recent years, many features such as cutting planes, presolve, heuristics, and advanced branching strategies have been added to MIP solvers to improve solving efficiency.

The branching strategy has a large impact on how effectively a MIP is solved, and it has been studied extensively in the literature.

The simplest branching strategy is most-infeasible branching, which at every node selects the variable whose fractional part is closest to 0.5. Another is pseudocost branching, which keeps a record of the dual bound increases observed during past branchings and uses this history to estimate the dual bound increase of each candidate variable; its drawback is that little information is available at the beginning of the search. Strong branching overcomes this drawback: for every fractional variable it solves the linear programming (LP) relaxations of the two child nodes obtained by branching on that variable, explicitly evaluating the dual bound increases, and then branches on the variable with the largest increase. Its effectiveness is evident, but because there are many candidate variables and each LP relaxation takes time to solve, strong branching spends far too much time per node and is therefore inefficient overall. There are of course many other branching strategies, such as reliability branching, inference branching, and nonchimerical branching; interested readers can consult the literature.
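
To make the cost of strong branching concrete, here is a minimal sketch of how the dual-bound increases could be evaluated at a node. `solve_lp_relaxation` is a hypothetical helper standing in for an LP solver call, and the product scoring rule is just one common choice, not necessarily the exact rule used in the paper.

```python
import math

def strong_branching_select(model, lp_value, fractional_vars, solve_lp_relaxation):
    """Pick a branching variable by explicitly evaluating dual-bound increases."""
    best_var, best_score = None, -math.inf
    for var, value in fractional_vars:              # (variable, fractional LP value) pairs
        # Hypothetical helper: re-solve the LP relaxation with the variable
        # bounded down or up, returning the child's optimal objective value.
        down_obj = solve_lp_relaxation(model, var, upper=math.floor(value))
        up_obj = solve_lp_relaxation(model, var, lower=math.ceil(value))
        down_gain = down_obj - lp_value             # dual-bound increase of the "down" child
        up_gain = up_obj - lp_value                 # dual-bound increase of the "up" child
        score = down_gain * up_gain                 # one common scoring rule (product of gains)
        if score > best_score:
            best_var, best_score = var, score
    return best_var
```

The expense is visible here: two extra LP solves per candidate variable at every node, which is exactly what the learned approximation tries to avoid.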

The contribution of this paper is to use machine learning to learn an approximation of strong branching and to integrate the learned model into the B&B framework, so as to speed up the solution process.

The binary MILP problems addressed in this paper have the following form:

$$\min_{x}\; c^{\top}x \quad \text{s.t.}\quad Ax \le b,\qquad x_i \in \{0,1\}\;\; \forall\, i \in \mathcal{I},\qquad x_j \ge 0\;\; \forall\, j \in \mathcal{C},$$

where $c$ and $A$ denote the cost coefficients and the coefficient matrix, $b$ is the right-hand side, and $\mathcal{I}$ and $\mathcal{C}$ are the index sets of the integer (binary) variables and of the continuous variables, respectively.

02 Learning Branching Decisions

Rather than proposing a new branching heuristic, this paper uses machine learning to construct a branching strategy by imitating the decisions generated by another strategy. The goal is to build an approximation of strong branching that is much cheaper to evaluate.

2.1 Data Set Generation

Machine learning first requires collecting data pairs $(\phi_i, y_i)$, where $\phi_i$ is the feature vector describing a candidate branching variable at the current node and $y_i$ is the corresponding label, and then training a model on these pairs. The data can be collected by solving a set of training problems with strong branching and recording the intermediate information of the search: at every node, the features of each candidate variable and its label are stored, and this data is finally fed to the machine learning algorithm to train the model.
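
A rough sketch of this data-collection loop might look as follows; `compute_features` and `strong_branching_scores` are hypothetical placeholders for the feature extraction of Section 3 and for the strong-branching evaluation, respectively.

```python
def collect_training_pairs(node, candidates, compute_features, strong_branching_scores):
    """Record one (feature vector, label) pair per candidate variable at a B&B node."""
    scores = strong_branching_scores(node, candidates)   # labels derived from strong branching
    pairs = []
    for var in candidates:
        phi = compute_features(node, var)                # features of `var` at this node (Section 3)
        pairs.append((phi, scores[var]))
    return pairs
```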

2.2 Machine Learning Algorithm

The algorithm used in this paper is Extremely Randomized Trees (ExtraTrees). ExtraTrees is a direct modification of random forests, and it is chosen because it is quite robust to the choice of its parameters.
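
Assuming the $(\phi_i, y_i)$ pairs have been collected as above, a minimal training sketch with scikit-learn's ExtraTreesRegressor (an implementation of Extremely Randomized Trees) could look like this; the hyperparameter values here are illustrative defaults, not the ones used in the paper.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def train_branching_model(pairs, n_estimators=100):
    """Fit an ExtraTrees model on the (feature vector, label) pairs from Section 2.1."""
    X = np.array([phi for phi, _ in pairs])
    y = np.array([label for _, label in pairs])
    model = ExtraTreesRegressor(n_estimators=n_estimators, random_state=0)
    model.fit(X, y)
    return model

def choose_branching_variable(model, candidate_vars, candidate_features):
    """At branching time, score every candidate and branch on the best one."""
    scores = model.predict(np.array(candidate_features))
    return candidate_vars[int(np.argmax(scores))]
```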

03 Features Describing the Current Subproblem

Features are essential to a machine learning algorithm and play a key role in the effectiveness of the method. On the one hand, the features must be complete and precise enough to describe the subproblem accurately; on the other hand, they must be cheap to compute. These two requirements have to be traded off, because some features contribute greatly to the quality of the method but are very expensive to compute. For example, the increase in the LP objective value obtained when branching on a variable is an excellent feature, and it is exactly what strong branching uses; however, computing it is far too expensive, so it is not suitable for the method in this paper.

The author discusses at length which features are chosen and why; interested readers can refer to the paper. The detailed feature descriptions below are important, so the original text is reproduced here.

3.1 Static Problem Features

The first set of features is computed from the sole parameters $c$ and $A$. They are calculated once and for all, and they represent the static state of the problem. Their goal is to give an overall description of the problem.

Besides the sign of the cost element $c_i$, two further cost-based quantities are used (their exact definitions are given in the paper). Distinguishing both is important, because the sign of the coefficient in the cost function is of utmost importance for evaluating the impact of a variable on the objective value.

The second class of static features is meant to represent the influence of the coefficients of variable $i$ in the coefficient matrix $A$. Three measures are developed that describe variable $i$ within the problem in terms of the constraints. Once the values of a measure have been computed, the corresponding features added to the feature vector are its minimum and maximum over the constraints. The rationale behind this choice is that, when it comes to describing the constraints of a given problem, only the extreme values are relevant.

The first measure is composed of two parts, each computed under its own condition (the formulas are given in the paper). The minimum and maximum values of both parts are used as features, to indicate by how much a variable contributes to the constraint violations.

The second measure models the relationship between the cost of a variable and the coefficients of the same variable in the constraints. Similarly to the first measure, it is split into two parts, and the feature vector contains both the minimum and the maximum values of each part.

Finally, the third measure represents the inter-variable relationships within the constraints. It is split into two parts, each of which is in turn divided into two sub-measures, giving four values per constraint. Again, the minimum and maximum of these four values computed over all constraints are added to the features.
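
The exact formulas of the three measures are not reproduced here; the sketch below only illustrates the aggregation pattern described above, i.e. evaluating some per-constraint measure for variable $i$ and keeping only its minimum and maximum over the constraints. The `constraint_measure` argument is a hypothetical stand-in for the paper's measures.

```python
import numpy as np

def static_features_for_variable(c, A, i, constraint_measure):
    """Min/max aggregation of a per-constraint measure, plus simple cost features."""
    # Sign of c_i; |c_i| stands in here for the paper's cost-based quantities.
    feats = [float(np.sign(c[i])), abs(float(c[i]))]
    # Evaluate the per-constraint measure for every constraint in which variable i
    # appears, then keep only the extreme values, as described above.
    values = [constraint_measure(A, i, j) for j in range(A.shape[0]) if A[j, i] != 0]
    # Zero default when the variable appears in no constraint (my assumption).
    feats.extend([min(values), max(values)] if values else [0.0, 0.0])
    return np.array(feats)
```
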
3.2 Dynamic Problem Features

The second type of features is related to the solution of the problem at the current B&B node. Those features contain the proportion of fixed variables at the current solution, the up and down fractionalities of variable $i$, the up and down Driebeek penalties (Driebeek 1966) corresponding to variable $i$, normalized by the objective value at the current node, and the sensitivity range of the objective function coefficient of variable $i$, also normalized by the current objective value.
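
For instance, the up and down fractionalities are cheap to compute from the LP value alone; a tiny sketch is given below (the Driebeek penalties and the sensitivity range require additional solver information and are omitted).

```python
import math

def fractionalities(x_i):
    """Up and down fractionality of the LP value x_i of an integer variable."""
    down = x_i - math.floor(x_i)   # distance from x_i to its rounded-down value
    up = math.ceil(x_i) - x_i      # distance from x_i to its rounded-up value
    return up, down
```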

3.3 Dynamic Optimization Features

The last set of features is meant to represent the overall state of the optimization. These features summarize global information that is not available from the single current node. When branching is performed on a variable, the objective increases are stored for that variable. From these numbers, we extract statistics for each variable: the minimum, the maximum, the mean, the standard deviation, and the quartiles of the objective increases. These statistics are used as features to describe the variable for which they were computed.

As those features should be independent of the scale of the problem, we divide each objective increase by the objective value at the current node, such that the computed statistics correspond to the relative objective increase for each variable. Finally, the last feature added to this subset is the number of times variable i has been chosen as branching variable, normalized by the total number of branchings performed.
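
A small sketch of how these per-variable statistics could be assembled, assuming the raw objective increases recorded for a variable and the current objective value are available; the zero default for variables that were never branched on is my assumption, not from the paper.

```python
import numpy as np

def objective_increase_stats(increases, current_obj, times_branched, total_branchings):
    """Per-variable statistics of relative objective increases (Section 3.3)."""
    if len(increases) == 0:
        rel = np.zeros(1)                 # no branching history yet (assumed default)
    else:
        rel = np.asarray(increases, dtype=float) / current_obj   # scale-independent increases
    q1, q2, q3 = np.percentile(rel, [25, 50, 75])
    return [rel.min(), rel.max(), rel.mean(), rel.std(),  # min, max, mean, standard deviation
            q1, q2, q3,                                   # quartiles
            times_branched / total_branchings]            # branching frequency of this variable
```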

04 Experiments

The experiments proceed in three steps: (1) we generate a data set using strong branching, (2) we learn a branching heuristic from it, and (3) we compare the learned heuristic with other branching strategies on various problems.

4.1 Problem Sets

Two types of problem sets are used, namely randomly generated problem sets and standard benchmark problems from the MIPLIB. The random problem sets are used for both training (steps 1 and 2) and assessing (step 3) the heuristics. MIPLIB problems are only used for assessment (step 3).

The possible constraints are chosen among set covering (SC), multiknapsack (MKN), bin packing (BP), and equality (EQ) constraints. We generated problems that contain constraints of type BP-EQ, BP-SC, and MKN-SC.

As some of those problems are going to be used to generate the training data set, we randomly split each family into a train and a test set. In the end, we have six data sets: BPEQ_train, BPEQ_test, BPSC_train, BPSC_test, MKNSC_train, and MKNSC_test.

The ExtraTrees hyperparameters are fixed to the values reported in the paper.

4.3 Results

We compare our approach to five other branching strategies, namely random branching (random), most-infeasible branching (MIB), nonchimerical branching (NCB) (Fischetti and Monaci 2012), full strong branching (FSB), and reliability branching (RB) (Achterberg et al. 2005).

In these tables (Tables 4, 5, and 7), Cl. Gap refers to the closed gap and S/T indicates the number of problems solved within the provided nodes or time limit versus the total number of problems. Nodes and Time, respectively, represent the number of explored nodes and the time spent (in seconds) before the optimization either finds the optimal solution or stops earlier because of one stopping criterion.

The normal learned branching strategy is trained on a data set containing samples from all three types of training problems (i.e., BPSC, BPEQ, and MKNSC).

Finally, Tables 6 and 7 report the results from our last set of experiments.

Additionally, in another experiment, we let CPLEX use cuts and heuristics (with default CPLEX parameters) in the course of the optimization to observe their impact on the efficiency of each branching strategy.

Overall, our method compares favorably to its competitors when cuts and heuristics are used by CPLEX. Indeed, in that case, our learned branching strategy is the fastest (almost three times faster than the second fastest method (i.e., MIB)) to solve all 30 considered problems.

Note that the apparent bad results of RB are due to three problems that are especially hard for that branching heuristic (air04, air05, and mod011).

Things appear to be different when cuts and heuristics are not used. Indeed, based on the results of Table 7, our method seems to be very slow, but the large number of nodes and the large amount of time are actually due to a small number of problems for which the method does not work well.

A possible explanation for why our method does not perform well on those problems is that these problems, because they are too large, are not well represented in the data set used to learn the branching strategy.

Reference

[1] Alejandro Marcos Alvarez, Quentin Louveaux, Louis Wehenkel (2017) A Machine Learning-Based Approximation of Strong Branching. INFORMS Journal on Computing 29(1):185-195. http://dx.doi.org/10.1287/ijoc.2016.0723

