Abstract

The vision of Industry 4.0 is to realize the notion of a lot size of one through enhanced adaptability and resilience of manufacturing and logistics operations to dynamic changes or deviations on the shop floor. This article is motivated by the lack of formal methods for the efficient transfer of knowledge across different yet interrelated tasks, with special reference to collaborative robotic operations such as material handling, machine tending, assembly, and inspection. We propose a meta-reinforcement learning framework that enhances the adaptability of collaborative robots to new tasks through task modularization and the efficient transfer of policies from previously learned task modules. Our experiments on the OpenAI Gym Robotics environments Reach, Push, and Pick-and-Place indicate an average 75% reduction in the number of iterations needed to reach a 60% success rate, as well as a 50–80% improvement in task completion efficiency, compared with the deep deterministic policy gradient (DDPG) algorithm as a baseline. The significant improvements in the jumpstart and asymptotic performance of the robot create new opportunities for addressing, through modularization and transfer learning, the current limitations of learning robots in industrial settings: sample inefficiency and overspecialization on a single task.
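For readers unfamiliar with the benchmark, the sketch below illustrates the kind of evaluation loop the reported success rates imply: rolling out a policy on a Gym Robotics Fetch task and counting successful episodes. This is a minimal illustration, not the authors' code; it assumes the classic `gym` API (four-tuple `step` return, `FetchReach-v1` with the MuJoCo backend installed), and a random policy stands in for a trained DDPG or meta-RL policy.

```python
# Hedged sketch of a success-rate evaluation on an OpenAI Gym Robotics task.
# Assumes classic gym (pre-gymnasium) with the robotics/MuJoCo extras.
import gym

env = gym.make("FetchReach-v1")  # also: FetchPush-v1, FetchPickAndPlace-v1

def success_rate(policy, episodes=20):
    """Fraction of episodes whose final step reports is_success."""
    successes = 0
    for _ in range(episodes):
        obs = env.reset()  # dict: observation, achieved_goal, desired_goal
        done = False
        while not done:
            action = policy(obs)
            obs, reward, done, info = env.step(action)
        successes += int(info["is_success"])  # set by the Fetch tasks
    return successes / episodes

# A random policy as a stand-in for a trained agent.
random_policy = lambda obs: env.action_space.sample()
print(f"success rate: {success_rate(random_policy):.2f}")
```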
